| 1 | +--- |
| 2 | +layout: post |
| 3 | +author: Bob |
| 4 | +categories: |
| 5 | + - ports & adapters |
| 6 | +tags: |
| 7 | + - python |
| 8 | + - architecture |
| 9 | +--- |
| 10 | + |
| 11 | +The term DDD comes from the book by Eric Evans: "Domain-Driven Design: Tackling |
| 12 | +Complexity in the Heart of Software" |
| 13 | +[https://www.amazon.co.uk/Domain-driven-Design-Tackling-Complexity-Software/dp/0321125215]. |
| 14 | +In his book he describes a set of practices that aim to help us build |
| 15 | +maintainable, rich software systems that solve customers' problems. The book is |
| 16 | +560 pages of dense insight, so you'll pardon me if my summary elides some |
| 17 | +details, but in brief he suggests: |
| 18 | + |
| 19 | + * Listen very carefully to your domain experts - the people whose job you're |
| 20 | + automating or assisting in software. |
| 21 | + * Learn the jargon that they use, and help them to come up with new jargon, so |
| 22 | + that every concept in their mental model is named by a single precise term. |
| 23 | + * Use those terms to model your software; the nouns and verbs of the domain |
| 24 | + expert are the classes and methods you should use in modelling. |
| 25 | + * Whenever there is a discrepancy in your shared understanding of the |
| 26 | + domain, go and talk to the domain experts again, and then refactor |
| 27 | + aggressively. |
| 28 | + |
| 29 | +This sounds great in theory, but in practice we often find that our business |
| 30 | +logic escapes from our model objects; we end up with logic bleeding into |
| 31 | +controllers, or into fat "manager" classes. We find that refactoring becomes |
| 32 | +difficult: we can't split a large and important class, because that would |
| 33 | +seriously impact the database schema; or we can't rewrite the internals of an |
| 34 | +algorithm because it has become tightly coupled to code that exists for a |
| 35 | +different use-case. The good news is that these problems can be avoided, since |
| 36 | +they are caused by a lack of organisation in the codebase. In fact, the tools to |
| 37 | +solve these problems take up half of the DDD book, but it can be difficult to |
| 38 | +understand how to use them together in the context of a complete system. |
| 39 | + |
| 40 | +I want to use this series to introduce an architectural style called Ports and |
| 41 | +Adapters [http://wiki.c2.com/?PortsAndAdaptersArchitecture], and a design |
| 42 | +pattern named Command Handler |
| 43 | +[https://matthiasnoback.nl/2015/01/responsibilities-of-the-command-bus/]. I'll |
| 44 | +be explaining the patterns in Python because that's the language that I use |
| 45 | +day-to-day, but the concepts are applicable to any OO language, and can be |
| 46 | +massaged to work perfectly in a functional context. There might be a lot more |
| 47 | +layering and abstraction than you're used to, especially if you're coming from a |
| 48 | +Django background or similar, but please bear with me. In exchange for a more |
| 49 | +complex system at the outset, we can avoid much of our accidental complexity |
| 50 | +[http://wiki.c2.com/?AccidentalComplexity] later. |
| 51 | + |
| 52 | +The system we're going to build is an issue management system, for use by a |
| 53 | +helpdesk. We're going to be replacing an existing system, which consists of an |
| 54 | +HTML form that sends an email. The emails go into a mailbox, and helpdesk staff |
| 55 | +go through the mails triaging problems and picking up problems that they can |
| 56 | +solve. Sometimes issues get overlooked for a long time, and the helpdesk team |
| 57 | +have invented a complex system of post-it notes and whiteboard layouts to track |
| 58 | +work in progress. For a while this system has worked pretty well but, as the |
| 59 | +system gets busier, the cracks are beginning to show. |
| 60 | + |
| 61 | +Our first conversation with the domain expert |
| 62 | +"What's the first step in the process?" you ask. "How do tickets end up in the |
| 63 | +mailbox?" |
| 64 | + |
| 65 | +"Well, the first thing that happens is the user goes to the web page, and they |
| 66 | +fill out some details, and report an issue. That sends an email into the issue |
| 67 | +log and then we pick issues from the log each morning." |
| 68 | + |
| 69 | +"So when a user reports an issue, what's the minimal set of data that you need |
| 70 | +from them?" |
| 71 | + |
| 72 | +"We need to know who they are, so their name, and email I guess. Uh... and the |
| 73 | +problem description. They're supposed to add a category, but they never do, and |
| 74 | +we used to have a priority, but everyone set their issue to EXTREMELY URGENT, so |
| 75 | +it was useless." |
| 76 | + |
| 77 | +"But a category and priority would help you to triage things?" |
| 78 | + |
| 79 | +"Yes, that would be really helpful if we could get users to set them properly." |
| 80 | + |
| 81 | +This gives us our first use case: As a user, I want to be able to report a new |
| 82 | +issue. |
| 83 | + |
| 84 | +Okay, before we get to the code, let's talk about architecture. The architecture |
| 85 | +of a software system is the overall structure - the choice of language, |
| 86 | +technology, and design patterns that organise the code and satisfy our |
| 87 | +constraints [https://en.wikipedia.org/wiki/Non-functional_requirement]. For our |
| 88 | +architecture, we're going to try and stick with three principles: |
| 89 | + |
| 90 | + 1. We will always define where our use-cases begin and end. We won't have |
| 91 | + business processes that are strewn all over the codebase. |
| 92 | + 2. We will depend on abstractions |
| 93 | + [https://en.wikipedia.org/wiki/Dependency_inversion_principle], and not on |
| 94 | + concrete implementations. |
| 95 | + 3. We will treat glue code as distinct from business logic, and put it in an |
| 96 | + appropriate place. |
| 97 | + |
| 98 | +Firstly we start with the domain model. The domain model encapsulates our shared |
| 99 | +understanding of the problem, and uses the terms we agreed with the domain |
| 100 | +experts. In keeping with principle #2 we will define abstractions for any |
| 101 | +infrastructural or technical concerns and use those in our model. For example, |
| 102 | +if we need to send an email, or save an entity to a database, we will do so |
| 103 | +through an abstraction that captures our intent. In this series we'll create a |
| 104 | +separate Python package for our domain model so that we can be sure it has no |
| 105 | +dependencies on the other layers of the system. Maintaining this rule strictly |
| 106 | +will make it easier to test and refactor our system, since our domain models |
| 107 | +aren't tangled up with messy details of databases and HTTP calls. |
| 108 | + |
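As a rough illustration of principle #2, the abstraction for recording issues could be written as an abstract base class. This is only a sketch: using the `abc` module is my assumption here, and `InMemoryIssueLog` is a hypothetical stand-in for a real database adapter.

```python
from abc import ABC, abstractmethod


class IssueLog(ABC):
    """The domain's view of 'somewhere that issues are recorded'."""

    @abstractmethod
    def add(self, issue):
        """Record a newly reported issue."""


class InMemoryIssueLog(IssueLog):
    """Hypothetical adapter, for illustration only: keeps issues in a list."""

    def __init__(self):
        self.issues = []

    def add(self, issue):
        self.issues.append(issue)


log = InMemoryIssueLog()
log.add("My mouse won't move")
print(len(log.issues))
```

The domain model only ever sees `IssueLog`, so swapping the in-memory version for one backed by a database should not require changes to the model.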
| 109 | + |
| 110 | + |
| 111 | +Around the outside of our domain model we place services. These are stateless |
| 112 | +objects that do stuff to the domain. In particular, for this system, our command |
| 113 | +handlers are part of the service layer. |
| 114 | + |
| 115 | + |
| 116 | + |
| 117 | +Finally, we have our adapter layer. This layer contains code that drives the |
| 118 | +service layer, or provides services to the domain model. For example, our domain |
| 119 | +model may have an abstraction for talking to the database, but the adapter layer |
| 120 | +provides a concrete implementation. Other adapters might include a Flask API, or |
| 121 | +our set of unit tests, or a Celery event queue. All of these adapters connect |
| 122 | +our application to the outside world. |
| 123 | + |
| 124 | + |
| 125 | + |
| 126 | +In keeping with our first principle, we're going to define a boundary for this |
| 127 | +use case and create our first Command Handler. A command handler is an object |
| 128 | +that orchestrates a business process. It does the boring work of fetching the |
| 129 | +right objects, and invoking the right methods on them. It's similar to the |
| 130 | +concept of a Controller in an MVC architecture. |
| 131 | + |
| 132 | +First, we create a Command object. |
| 133 | + |
| 134 | +class ReportIssueCommand(NamedTuple): |
| 135 | +    reporter_name: str |
| 136 | +    reporter_email: str |
| 137 | +    problem_description: str |
| 138 | + |
| 139 | + |
| 140 | +A command object is a small object that represents a state-changing action that |
| 141 | +can happen in the system. Commands have no behaviour; they're pure data |
| 142 | +structures. There's no reason why you have to represent them with classes, since |
| 143 | +all they need is a name and a bag of data, but a NamedTuple is a nice compromise |
| 144 | +between simplicity and convenience. Commands are instructions from an external |
| 145 | +agent (a user, a cron job, another service etc.) and have names in the |
| 146 | +imperative mood, for example: |
| 147 | + |
| 148 | + * ReportIssue |
| 149 | + * PrepareUploadUri |
| 150 | + * CancelOutstandingOrders |
| 151 | + * RemoveItemFromCart |
| 152 | + * OpenLoginSession |
| 153 | + * PlaceCustomerOrder |
| 154 | + * BeginPaymentProcess |
| 155 | + |
| 156 | +We should try to avoid the verbs Create, Update, or Delete (and their synonyms) |
| 157 | +because those are technical implementations. When we listen to our domain |
| 158 | +experts, we often find that there is a better word for the operation we're |
| 159 | +trying to model. If all of your commands are named "CreateIssue", "UpdateCart", |
| 160 | +"DeleteOrders", then you're probably not paying enough attention to the language |
| 161 | +that your stakeholders are using. |
| 162 | + |
| 163 | +The command objects belong to the domain, and they express the API of your |
| 164 | +domain. If every state-changing action is performed via a command handler, then |
| 165 | +the list of Commands is the complete list of supported operations in your domain |
| 166 | +model. This has two major benefits: |
| 167 | + |
| 168 | + 1. If the only way to change state in the system is through a command, then the |
| 169 | + list of commands tells me all the things I need to test. There are no other |
| 170 | + code paths that can modify data. |
| 171 | + 2. Because our commands are lightweight, logic-free objects, we can create them |
| 172 | + from an HTTP POST, or a Celery task, or a command-line CSV reader, or a unit |
| 173 | + test. They form a simple and stable API for our system that does not depend |
| 174 | + on any implementation details and can be invoked in multiple ways. |
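To make the second benefit concrete, here is a minimal sketch of the same command being built from two different entry points. The helper names `command_from_form` and `commands_from_csv`, the form field names, and the CSV column order are all assumptions for illustration.

```python
import csv
import io
from typing import NamedTuple


class ReportIssueCommand(NamedTuple):
    reporter_name: str
    reporter_email: str
    problem_description: str


def command_from_form(form):
    # `form` is any mapping of field names to values, e.g. a parsed HTTP POST body.
    return ReportIssueCommand(
        form["name"], form["email"], form["description"])


def commands_from_csv(text):
    # Each row is assumed to hold name, email, description in that order.
    return [ReportIssueCommand(*row) for row in csv.reader(io.StringIO(text))]


from_web = command_from_form({
    "name": "bob",
    "email": "bob@example.org",
    "description": "My mouse won't move",
})
from_file = commands_from_csv("bob,bob@example.org,My mouse won't move\n")
print(from_web == from_file[0])
```

Whichever adapter builds the command, the handler receives exactly the same value.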
| 175 | + |
| 176 | +In order to process our new command, we'll need to create a command handler. |
| 177 | + |
| 178 | +class ReportIssueHandler: |
| 179 | +    def __init__(self, issue_log): |
| 180 | +        self.issue_log = issue_log |
| 181 | + |
| 182 | +    def __call__(self, cmd): |
| 183 | +        reported_by = IssueReporter( |
| 184 | +            cmd.reporter_name, |
| 185 | +            cmd.reporter_email) |
| 186 | +        issue = Issue(reported_by, cmd.problem_description) |
| 187 | +        self.issue_log.add(issue) |
| 188 | + |
| 189 | + |
| 190 | + |
| 191 | +Command handlers are stateless objects that orchestrate the behaviour of a |
| 192 | +system. They are a kind of glue code, and manage the boring work of fetching and |
| 193 | +saving objects, and then notifying other parts of the system. In keeping with |
| 194 | +principle #3, we keep this in a separate layer. To satisfy principle #1, each |
| 195 | +use case is a separate command handler and has a clearly defined beginning and |
| 196 | +end. Every command is handled by exactly one command handler. |
| 197 | + |
| 198 | +In general all command handlers will have the same structure: |
| 199 | + |
| 200 | + 1. Fetch the current state from our persistent storage. |
| 201 | + 2. Update the current state. |
| 202 | + 3. Persist the new state. |
| 203 | + 4. Notify any external systems that our state has changed. |
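The four steps above can be sketched as a handler. Everything here is hypothetical and is not part of the series code: the `AssignIssueCommand`, the `assign_to` method, the `save` method on the log, and the `notify` callable are invented purely to show the shape.

```python
from typing import NamedTuple


class AssignIssueCommand(NamedTuple):
    issue_id: int
    assignee: str


class Issue:
    def __init__(self, description):
        self.description = description
        self.assignee = None

    def assign_to(self, assignee):
        self.assignee = assignee


class InMemoryIssueLog:
    def __init__(self, issues):
        self.issues = dict(issues)

    def get(self, issue_id):
        return self.issues[issue_id]

    def save(self, issue_id, issue):
        self.issues[issue_id] = issue


class AssignIssueHandler:
    def __init__(self, issue_log, notify):
        self.issue_log = issue_log
        self.notify = notify

    def __call__(self, cmd):
        issue = self.issue_log.get(cmd.issue_id)    # 1. fetch the current state
        issue.assign_to(cmd.assignee)               # 2. update it via the model
        self.issue_log.save(cmd.issue_id, issue)    # 3. persist the new state
        self.notify(f"issue {cmd.issue_id} assigned to {cmd.assignee}")  # 4. notify


log = InMemoryIssueLog({1: Issue("My mouse won't move")})
sent = []
AssignIssueHandler(log, sent.append)(AssignIssueCommand(1, "fred"))
print(log.get(1).assignee, sent)
```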
| 204 | + |
| 205 | +We will usually avoid if statements, loops, and other such wizardry in our |
| 206 | +handlers, and stick to a single line of execution. Handlers are boring glue code. |
| 207 | + |
| 208 | +Since our command handlers are just glue code, we won't put any business logic |
| 209 | +into them - they shouldn't be making any business decisions. For example, let's |
| 210 | +skip ahead a little to a new command handler: |
| 211 | + |
| 212 | +class MarkIssueAsResolvedHandler: |
| 213 | +    def __init__(self, issue_log): |
| 214 | +        self.issue_log = issue_log |
| 215 | + |
| 216 | +    def __call__(self, cmd): |
| 217 | +        issue = self.issue_log.get(cmd.issue_id) |
| 218 | +        # the following line encodes a business rule |
| 219 | +        if issue.state != IssueStatus.Resolved: |
| 220 | +            issue.mark_as_resolved(cmd.resolution) |
| 221 | + |
| 222 | + |
| 223 | +This handler violates our glue-code principle because it encodes a business |
| 224 | +rule: "If an issue is already resolved, then it can't be resolved a second |
| 225 | +time". This rule belongs in our domain model, probably in the mark_as_resolved |
| 226 | +method of our Issue object. |
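One way the rule might move into the model is sketched below. Raising a hypothetical `IssueAlreadyResolved` error is my assumption; silently ignoring the second call would be another reasonable choice, and the domain experts should decide which. A plain dict stands in for the issue log, since it already has a `get` method.

```python
from enum import Enum
from typing import NamedTuple


class IssueStatus(Enum):
    Open = "open"
    Resolved = "resolved"


class IssueAlreadyResolved(Exception):
    """Hypothetical domain error, for illustration only."""


class Issue:
    def __init__(self, description):
        self.description = description
        self.state = IssueStatus.Open
        self.resolution = None

    def mark_as_resolved(self, resolution):
        # The business rule lives here, next to the state it protects.
        if self.state == IssueStatus.Resolved:
            raise IssueAlreadyResolved(self.description)
        self.state = IssueStatus.Resolved
        self.resolution = resolution


class MarkIssueAsResolvedCommand(NamedTuple):
    issue_id: int
    resolution: str


class MarkIssueAsResolvedHandler:
    def __init__(self, issue_log):
        self.issue_log = issue_log

    def __call__(self, cmd):
        # No branching: fetch the issue and tell it what happened.
        issue = self.issue_log.get(cmd.issue_id)
        issue.mark_as_resolved(cmd.resolution)


issues = {1: Issue("My mouse won't move")}
handler = MarkIssueAsResolvedHandler(issues)
handler(MarkIssueAsResolvedCommand(1, "Plugged it back in"))
print(issues[1].state)
```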
| 227 | + |
| 228 | +I tend to use classes for my command handlers, and to invoke them with the |
| 229 | +__call__ magic method, but a function is a perfectly valid handler too. A class |
| 230 | +can make dependency management a little easier, but the two approaches are |
| 231 | +equivalent. For example, we could rewrite our ReportIssueHandler like this: |
| 232 | + |
| 233 | +def ReportIssue(issue_log, cmd): |
| 234 | +    reported_by = IssueReporter( |
| 235 | +        cmd.reporter_name, |
| 236 | +        cmd.reporter_email) |
| 237 | +    issue = Issue(reported_by, cmd.problem_description) |
| 238 | +    issue_log.add(issue) |
| 239 | + |
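With the function form, the dependency can be bound up front so that callers only pass the command, just as they would with a class instance. A minimal sketch using `functools.partial` follows; `ListIssueLog` is a hypothetical in-memory log that exists only to make the example runnable.

```python
from functools import partial
from typing import NamedTuple


class ReportIssueCommand(NamedTuple):
    reporter_name: str
    reporter_email: str
    problem_description: str


class IssueReporter:
    def __init__(self, name, email):
        self.name = name
        self.email = email


class Issue:
    def __init__(self, reporter, description):
        self.reporter = reporter
        self.description = description


class ListIssueLog:
    """Hypothetical in-memory log, for illustration only."""

    def __init__(self):
        self.issues = []

    def add(self, issue):
        self.issues.append(issue)


def ReportIssue(issue_log, cmd):
    reported_by = IssueReporter(cmd.reporter_name, cmd.reporter_email)
    issue_log.add(Issue(reported_by, cmd.problem_description))


issue_log = ListIssueLog()
# Bind the dependency once; the result takes only the command.
handler = partial(ReportIssue, issue_log)
handler(ReportIssueCommand("bob", "bob@example.org", "My mouse won't move"))
print(len(issue_log.issues))
```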
| 240 | + |
| 241 | +If magic methods make you feel queasy, you can define a handler to be a class |
| 242 | +that exposes a handle method like this: |
| 243 | + |
| 244 | +class ReportIssueHandler: |
| 245 | +    def handle(self, cmd): |
| 246 | +        ... |
| 247 | + |
| 248 | + |
| 249 | +However you structure them, the important ideas of commands and handlers are: |
| 250 | + |
| 251 | + 1. Commands are logic-free data structures with a name and a bunch of values. |
| 252 | + 2. They form a stable, simple API that describes what our system can do, and |
| 253 | + doesn't depend on any implementation details. |
| 254 | + 3. Each command can be handled by exactly one handler. |
| 255 | + 4. Each command instructs the system to run through one use case. |
| 256 | + 5. A handler will usually do the following steps: get state, change state, |
| 257 | + persist state, notify other parties that state was changed. |
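Point 3 can be enforced with a small dispatcher that maps each command type to its handler. The `CommandBus` class below is a hypothetical sketch, not something defined in this post; it refuses a second registration for the same command type.

```python
from typing import NamedTuple


class ReportIssueCommand(NamedTuple):
    reporter_name: str
    reporter_email: str
    problem_description: str


class CommandBus:
    """Hypothetical dispatcher: one handler per command type."""

    def __init__(self):
        self._handlers = {}

    def register(self, command_type, handler):
        if command_type in self._handlers:
            raise ValueError(f"{command_type.__name__} already has a handler")
        self._handlers[command_type] = handler

    def handle(self, cmd):
        # Look the handler up by the command's class and invoke it.
        return self._handlers[type(cmd)](cmd)


reported = []
bus = CommandBus()
# Any callable works as a handler; here a list's append stands in for one.
bus.register(ReportIssueCommand, reported.append)
bus.handle(ReportIssueCommand("bob", "bob@example.org", "My mouse won't move"))
print(len(reported))
```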
| 258 | + |
| 259 | +Let's take a look at the complete system. I'm concatenating all the files into a |
| 260 | +single code listing for ease of grokking, but in the git repository |
| 261 | +[https://github.com/bobthemighty/blog-code-samples/tree/master/ports-and-adapters/01] |
| 262 | +I'm splitting the layers of the system into separate packages. In the real |
| 263 | +world, I would probably use a single Python package for the whole app, but in |
| 264 | +other languages - Java, C#, C++ - I would usually have a single binary for each |
| 265 | +layer. Splitting the packages up this way makes it easier to understand how the |
| 266 | +dependencies work. |
| 267 | + |
| 268 | +from typing import NamedTuple |
| 269 | +from expects import expect, have_len, equal |
| 270 | + |
| 271 | +# Domain model |
| 272 | + |
| 273 | +class IssueReporter: |
| 274 | +    def __init__(self, name, email): |
| 275 | +        self.name = name |
| 276 | +        self.email = email |
| 277 | + |
| 278 | + |
| 279 | +class Issue: |
| 280 | +    def __init__(self, reporter, description): |
| 281 | +        self.description = description |
| 282 | +        self.reporter = reporter |
| 283 | + |
| 284 | + |
| 285 | +class IssueLog: |
| 286 | +    def add(self, issue): |
| 287 | +        pass |
| 288 | + |
| 289 | + |
| 290 | +class ReportIssueCommand(NamedTuple): |
| 291 | +    reporter_name: str |
| 292 | +    reporter_email: str |
| 293 | +    problem_description: str |
| 294 | + |
| 295 | + |
| 296 | +# Service Layer |
| 297 | + |
| 298 | +class ReportIssueHandler: |
| 299 | + |
| 300 | +    def __init__(self, issue_log): |
| 301 | +        self.issue_log = issue_log |
| 302 | + |
| 303 | +    def __call__(self, cmd): |
| 304 | +        reported_by = IssueReporter( |
| 305 | +            cmd.reporter_name, |
| 306 | +            cmd.reporter_email) |
| 307 | +        issue = Issue(reported_by, cmd.problem_description) |
| 308 | +        self.issue_log.add(issue) |
| 309 | + |
| 310 | + |
| 311 | +# Adapters |
| 312 | + |
| 313 | +class FakeIssueLog(IssueLog): |
| 314 | + |
| 315 | +    def __init__(self): |
| 316 | +        self.issues = [] |
| 317 | + |
| 318 | +    def add(self, issue): |
| 319 | +        self.issues.append(issue) |
| 320 | + |
| 321 | +    def get(self, id): |
| 322 | +        return self.issues[id] |
| 323 | + |
| 324 | +    def __len__(self): |
| 325 | +        return len(self.issues) |
| 326 | + |
| 327 | +    def __getitem__(self, idx): |
| 328 | +        return self.issues[idx] |
| 329 | + |
| 330 | + |
| 331 | + |
| 332 | +name = "bob" |
| 333 | +email = "bob@example.org" |
| 334 | +desc = "My mouse won't move" |
| 335 | + |
| 336 | +class When_reporting_an_issue: |
| 337 | + |
| 338 | +    def given_an_empty_issue_log(self): |
| 339 | +        self.issues = FakeIssueLog() |
| 340 | + |
| 341 | +    def because_we_report_a_new_issue(self): |
| 342 | +        handler = ReportIssueHandler(self.issues) |
| 343 | +        cmd = ReportIssueCommand(name, email, desc) |
| 344 | + |
| 345 | +        handler(cmd) |
| 346 | + |
| 347 | +    def the_handler_should_have_created_a_new_issue(self): |
| 348 | +        expect(self.issues).to(have_len(1)) |
| 349 | + |
| 350 | +    def it_should_have_recorded_the_issuer(self): |
| 351 | +        expect(self.issues[0].reporter.name).to(equal(name)) |
| 352 | +        expect(self.issues[0].reporter.email).to(equal(email)) |
| 353 | + |
| 354 | +    def it_should_have_recorded_the_description(self): |
| 355 | +        expect(self.issues[0].description).to(equal(desc)) |
| 356 | + |
| 357 | + |
| 358 | +There's not a lot of functionality here, and our issue log has a couple of |
| 359 | +problems: firstly, there's no way to see the issues in the log yet, and secondly, |
| 360 | +we'll lose all of our data every time we restart the process. We'll fix the |
| 361 | +second of those in the next part |
| 362 | +[https://io.made.com/blog/repository-and-unit-of-work-pattern-in-python/]. |