Aircraft do not have a singular unique identifier that is time invariant.
While it is true that aircraft have serial numbers issued to their airframe, by itself, aircraft serial numbers are not unique.
The only unique identifier for an aircraft across its lifecycle from production to end of life is a combination of the manufacturer, make and serial number.
I know this because I am on (for better or worse) the patent that involves defining that as a unique identifier for aircraft.
The combination of ICAO aircraft type designator + serial number is approximately the most permanent identifier for an airframe, and even then, if an airframe is modified significantly enough that it is no longer the previous type, this identifier can change.
Personally, it boggled my mind that something as big as an aircraft did not have a simple time invariant unique identifier.
P.S. For those who might ask - aircraft registration numbers are like license plates, so they change - tail numbers can be ambiguous and misinterpreted depending on what is painted on the aircraft where, and ICAO 24-bit aircraft addresses are tied to ADS-B transponder boxes, which technically can be moved and reprogrammed between aircraft also.
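A minimal sketch of the composite identifier described above, assuming a Python data model. The class names, field names and sample values are all hypothetical; the only point is that identity is the tuple, while registration and the 24-bit address are attributes that can change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AirframeKey:
    """Composite identity: no single field is assumed unique on its own."""
    manufacturer: str
    model: str
    serial_number: str  # kept as a string: serials may contain letters or leading zeros


@dataclass
class Aircraft:
    key: AirframeKey
    registration: str  # changes like a license plate
    icao24: str        # tied to the transponder, which can be moved or reprogrammed


# Two hypothetical airframes that share a serial number but not a key.
a = Aircraft(AirframeKey("Cessna", "172", "0001"), "N12345", "a1b2c3")
b = Aircraft(AirframeKey("Piper", "PA-28", "0001"), "N54321", "d4e5f6")

fleet = {ac.key: ac for ac in (a, b)}

# Re-registering does not disturb the lookup key.
a.registration = "G-ABCD"
print(fleet[AirframeKey("Cessna", "172", "0001")].registration)
```

As the comment notes, even this key can change if the airframe is modified into a different type, so it is a best available identifier rather than a guaranteed one.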
Go work at a big company. The patent lawyers come around and ask what you've been working on, and a month or two later, your name's on 10 patents, none of which make any sense whatsoever. If you're very lucky you might get a dollar bill for each.
You burrow this simple idea in pages and pages of obfuscated tedium, and that's good enough that everyone is happy. Patent office gets their fee, lawyers get paid, company can say it has a supercharged patented innovation.
I was wondering the same thing. I've had to derive unique identifiers from hundreds of different data sets over the years. What makes it special when it's a plane?
> And the solution is almost always “model, make, and serial number.”
If you've ever spent time in old car forums, you learn that even this isn't enough because of production-line sloppiness.
Serial number re-use is rare, but it happens. Usually because a product had something detected that resulted in remanufacturing, but sometimes other things slip.
I know of systems that had two types of serial numbers which ought to have been the same, but weren't, because they had been programmed at different EOL stages just as daylight saving time kicked in. One of the systems ran in UTC, the other in local time. The date was part of the serial.
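The failure mode described above can be reproduced in a few lines. This is only a sketch: the serial format, the counter and the fixed UTC+1 offset are hypothetical stand-ins (a fixed offset is used instead of a real DST-aware zone so the example runs anywhere).

```python
from datetime import datetime, timedelta, timezone


def serial_for(moment: datetime, unit_counter: int) -> str:
    """Build a hypothetical serial number with the date embedded in it."""
    return f"{moment:%Y%m%d}-{unit_counter:04d}"


# One physical instant, shortly before midnight UTC.
instant = datetime(2024, 3, 30, 23, 30, tzinfo=timezone.utc)

# Hypothetical station clock running one hour ahead of UTC.
local_zone = timezone(timedelta(hours=1))

serial_utc = serial_for(instant, 42)
serial_local = serial_for(instant.astimezone(local_zone), 42)

print(serial_utc)
print(serial_local)
print(serial_utc == serial_local)
```

The same unit ends up with two different "identical" serials because the two stations disagree about which day it is.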
I'm only joking a little. Funny thing, surnames aren't actually that old for Europeans. Most of history there'd be maybe two people with the same name. They solved it back then very much the same way we solve it now.
The full Theseus treatment would need you to take [part of] the airframe that first plane discarded, then recertify it for use under its original serial number.
The way the Aircraft of Theseus is generally resolved is there’s a piece of metal called the “data plate”. This is the airplane as far as the FAA is concerned. I’ve been in a vintage biplane that was completely rebuilt from the data plate up. I think they got it for $40k.
It was worth it because without that, a home built airplane would have an experimental certificate and you couldn’t sell rides in it.
Does the data plate not limit the scope of what can be built around it?
In other words had Virgin Galactic built the VSS Enterprise around the data plate of a Cessna 172, would it then no longer have been an experimental aircraft?
> Personally, it boggled my mind that something as big as an aircraft did not have a simple time invariant unique identifier.
It boggles my mind that despite not having some sort of universal system things work as well as they do.
Aviation grew up relatively insular, and each country that had any sort of aircraft manufacturing did things their own way until fairly recently. Arguably, the first half of the history of aviation is a kind of free-for-all. The fact that we now have a globalized airline industry that mostly follows some kind of standards is the mind-blowing part to me. And I suspect if we weren't mostly down to a dozen or so manufacturers for the vast majority of airliners, even that wouldn't be the case.
Yeah but at some point countries started buying larger planes from only one or two manufacturers. At that point the manufacturers could standardize things.
And many commercial airliners are sold without engines at all.
The operators, such as Delta, do not actually own the engines on the aircraft they fly, even though they own the aircraft. The engines are rented from e.g. Pratt & Whitney along with a maintenance contract. That said, the engines are in fact installed at the factory.
Engines are actually changed fairly frequently because they're a wear component on most airplanes. They are also sometimes updated to a newer version or even an entirely different manufacturer. And often it's faster and cheaper to swap in a new engine that's ready to go rather than wait for the one that's attached to be overhauled, so the same engine might see service on multiple airframes.
A lot of these so called "falsehoods" are just design failures on the part of programmers. Someone did it badly first, and it stuck, and a second person came in later and is surprised by the bad design. That's not really interesting, it happens all the time in software. So much so that seasoned engineers have come to expect poor design until proven otherwise.
Things like flight numbers not having reasonable semantics, or conceptual pollution of what a flight is to include multiple take offs and landings are bad design, plain and simple. Just model the problem correctly e.g. maybe a Trip is multiple Flights, or Flights have multiple Legs. This isn't aviation specific. These are generic problems that programmers can and should get right.
Some of it is intrinsic to the domain, like flights not all having gates, or not landing at airports. That was a new tidbit for me.
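One possible shape for the "Flights have multiple Legs" suggestion above, sketched in Python. All names and values are hypothetical, and the optional gate and free-form place fields are assumptions chosen to accommodate the "no gate" and "not an airport" cases mentioned in the same comment.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Leg:
    origin: str                 # free-form place, not necessarily an airport code
    destination: str
    gate: Optional[str] = None  # not every departure has a gate


@dataclass
class Flight:
    number: str
    legs: List[Leg] = field(default_factory=list)

    def stops(self) -> List[str]:
        """Every place the flight touches, in order."""
        if not self.legs:
            return []
        return [self.legs[0].origin] + [leg.destination for leg in self.legs]


flight = Flight(
    "XX123",
    [Leg("AAA", "BBB", gate="12"), Leg("BBB", "hospital roof")],
)
print(flight.stops())
```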
The claim isn't that programmers go around literally believing falsehoods about a given domain. The whole point of the "falsehoods programmers believe about X" genre is a tongue-in-cheek way of listing the kind of bad design assumptions that happen in a given domain, and I believe that is very interesting indeed.
The fact remains that software that models real-life events or information is making normative assumptions about what can and cannot happen in the domain, due to the very nature of software, and these assumptions are knowingly-or-not being introduced by programmers. If for any given domain we had hundreds of human notaries, scribes or typists managing information instead of software, their mistaken assumptions wouldn't matter—they would simply go "Oh, that's odd", make the necessary adjustments, and learn from the experience. But as long as software is a prescriptive model of what it is representing, it will be valuable to highlight the "falsehoods" that its creators may accidentally prescribe into it.
the point of the article (just as with the one about names) is that there are "reasonable defaults" many people would believe - that don't work in practice and become gotchas
whether you have enough knowledge to know that something is unreasonable doesn't mean it doesn't seem reasonable for many others
And as programmers, we inevitably spend most of our time dealing with these weird edge cases, because the stuff that makes sense is generally incorporated into our initial modelling and becomes a solved problem.
As usual with these lists, they would much benefit from more in-depth explanation. This list at least deigns to link to examples for many of the claims (like a flight that leaves on time but arrives 40 hours late [1]), but doesn't explain what happened.
Having said that, many of the links are very informative. For example the crater on Mars that has an ICAO airport code [2]: "On 19 April 2021, Ingenuity performed the first powered flight on Mars from Jezero, which received the commemorative ICAO airport code JZRO."
This is often for boring reasons - the two week flight was a Google Balloon, the flight was delayed for 40 hours due to bad weather, ADS-B is set by the pilot and many pilots simply set it wrong, and so on.
I develop software for flight data analysis at a company that makes flight data recorders. Our focus is mainly helicopters, but some fixed wing. Dealing with aircraft that may take off or land at a base, hospital, roof, parking lot, football field, airport, golf course, etc., I feel like most of my days are spent on all sorts of falsehoods about aviation.
Funny how the common thread through many of these 'Falsehoods...' posts is that many programmers think that systems designed by humans, for humans, and kept running by humans will rigidly adhere to a set of rules and don't have edge cases.
Us programmers like to distill everything down to rigid sets of rules because that's how our mind operates. The fewer probabilistic "analog" parameters, the better. Of course the real world doesn't work this way.
It is by no means specific to programmers. Ask to someone who learns French, for instance. Rules with too many arbitrary exceptions.
What is specific to programmers is that their tool performs at its best with simpler rules, so their job is to find the necessary and sufficient set of rules, and they will dismiss most of the cases pointed out by this article as unimportant exceptions the software won't handle.
> Ask to someone who learns French, for instance. Rules with too many arbitrary exceptions.
I took French in middle school, and it was always a running joke that the teacher spent the first 5 minutes on the rule, and the next 40 minutes on the exceptions.
In the end the data has to fit into structures or tables that can be processed by some algorithms. If the system is not rigid to a certain degree it would become unmaintainable or full of bugs or both.
Not really. It's just that software by definition must create a model of the domain it attempts to handle. And a model is, in the end, a set of rules. With an absence of rules, the software can't really do anything, and would be pretty pointless for actually solving any problem. The alternative is to hand the users Notepad and say "knock yourselves out".
I'd argue that programmers are indeed much more aware of how many exceptions and edge cases most real world domains have. Ask a lay person about such a simple thing as leap seconds, for instance, and they'll often believe you're making shit up.
The profession of programming is fundamentally about the interface between squishy human systems and rigid rule-based machines. No surprise that keeps coming up.
It is the classic scenario of confusing the map with the territory.
In the map everything is clear. It is clear what a "plane" is, what "airports" are, and what their relationship is. And transferring that into a computer program is straightforward.
In the territory everything is fuzzy. None of the definitions are without edge cases and the expected relationships are often violated in surprising ways.
Aviation isn't unique here, every system suffers from the distinction between its actual function and the abstract description of that system.
I think it's more, "you might think as a programmer writing software that models the world of aviation that you could assume the following things— but alas."
Software unfortunately follows rigid rules so the challenge is finding a set of rigid rules that can encompass reality. It would be pretty natural if you were writing a database schema that a flight would have a departing airport and an arriving airport— but alas.
Well what I'm speaking to more is that most systems that you model, most of the model is already assumptions. So natural or not, that database schema is already invoking assumptions which may or may not be false. Especially when dealing with any system where humans are directly involved in it. For many things, there's no exhaustive list of rules that will cover all the cases. As they say, if you make something idiot-proof, they'll invent a better idiot.
And this is why I much prefer surrogate values for primary keys over natural values. And why I've gravitated to using UUID values for surrogates, not integer identities.
A theme running through the article is "this value is unique" and "this value does not change". And of course those are both wrong.
So when designing databases now I assume "everything changes, nothing is unique" (even when the domain "expert" professes it is).
This approach solves so many problems and saves so much time later on when it turns out that that "absolutely, positively, unique for ever" natural key, isn't.
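A small sketch of the surrogate-key approach, using Python's built-in `sqlite3` and `uuid` modules. The table names, column names and airport codes are hypothetical; the point is only that references survive when the "natural" code changes.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# The surrogate id is the primary key; the natural code carries no uniqueness constraint.
conn.execute("CREATE TABLE airport (id TEXT PRIMARY KEY, code TEXT, name TEXT)")
conn.execute(
    "CREATE TABLE departure (id TEXT PRIMARY KEY, airport_id TEXT REFERENCES airport(id))"
)

airport_id = str(uuid.uuid4())
conn.execute("INSERT INTO airport VALUES (?, ?, ?)", (airport_id, "XXA", "Old Name"))
conn.execute("INSERT INTO departure VALUES (?, ?)", (str(uuid.uuid4()), airport_id))

# The "never changes" natural code changes; only one row needs updating.
conn.execute(
    "UPDATE airport SET code = ?, name = ? WHERE id = ?",
    ("XXB", "New Name", airport_id),
)

row = conn.execute(
    "SELECT a.code FROM departure d JOIN airport a ON a.id = d.airport_id"
).fetchone()
print(row)
```

This does not address the index-size and locality costs raised in the reply below, which depend on the database engine and table size.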
The tradeoff you’re making is performance, sometimes a lot depending on your RDBMS and table size. For smaller tables, under 10,000,000 rows or so, you won’t really notice much, but in the hundreds of millions or billions, you definitely do.
A UUID is at best 2x larger than even a BIGINT, thus the index size is 2x larger. If you aren’t using v1 or v7, it’s also not k-sortable. But most importantly for MySQL (and optionally SQL Server) if the table contains things related to a common entity, like a user’s purchases, the rows are now scattered around the clustering index’s B+tree. That incurs a huge amount of I/O on large tables, and short of a covering index and/or partitioning (which only masks the problem by shrinking the search space), there is no way to improve it. If instead the PK was (user_id, some_other_identifier), all records for a given user are physically co-located.
But it doesn't help much, as the surrogate only lives in your system.
So now some information comes in from outside the system that something happened with a plane, and you still have to find which surrogate id that plane has in your system.
You may decide two things happened to two different planes whereas another system consider it the same plane both times, and vice versa.
The uuid keys make it easy to change some value, but won’t solve the issue of keeping a record of historical changes.
UUID keys PLUS some form of versioning with creation dates will let you change an airport name and let you know what the airport name was on some arbitrary date in the past. Useful for backfills and debugging
You don’t need all that; any candidate key (even natural) with the addition of a datetime would work. What was the definition of Airport X before Datetime Y? And after? Etc.
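The "what was it called on date Y" lookup discussed above can be sketched with a sorted list of effective dates. The history entries and the `name_as_of` helper are hypothetical, and this assumes each record carries the date from which it became valid.

```python
from bisect import bisect_right
from datetime import date

# (effective_from, name), sorted by effective_from. Hypothetical data.
history = [
    (date(2000, 1, 1), "Old Field"),
    (date(2015, 6, 1), "New International"),
]


def name_as_of(records, day):
    """Return the name in effect on `day`, or None if nothing was valid yet."""
    starts = [start for start, _ in records]
    index = bisect_right(starts, day) - 1
    if index < 0:
        return None
    return records[index][1]


print(name_as_of(history, date(2010, 1, 1)))
print(name_as_of(history, date(2015, 6, 1)))
print(name_as_of(history, date(1999, 1, 1)))
```

The same lookup works whether the rows are keyed by a UUID or by a natural candidate key; the effective date is what provides the history.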
I always look at these "Falsehoods Programmers Believe..." lists as a source of tests. Each item should spawn a number of unit or integration tests that will help to uproot any of these assumptions that were incorrectly baked into your software.
Each of these did indeed spawn tests. I used to work there and at the time there were over a thousand ranging from humdrum to David Blaine skydiving. They’re a crowd who really put a focus on good engineering
I find this list strange. I have only a passing interest in aviation and I would not believe very many of these.
What made the corresponding lists for names and time interesting were that it was genuinely surprising to realise that their statements were actually false. I don't get that feeling with these.
Like the top level comment about identifiers for airplanes -- why would they have them? That sounds baffling to me. With ownership changes, continuous upgrades, extending airframes, repurposing etc. I would be surprised if there was a stable identity.
My impression is that every single older (pre-2010) computer system that manages Brazilian aviation fell for that and fixed it with a hack.
> Airports never move
Also, Runways never move. Also, if runways move, they don't change direction. Also, if airport or runways move, there will exist some construction work before.
I'd add "aircraft only land on runways" there too. And "ok, aircraft only land on runways and heliports".
I would assume it's somewhat speaking to the prevalence of many informal landing strips, and also that river landings are probably fairly common too. I'd have to imagine places like Alaska might also have to deal with that, especially if you have small local 'airlines' (which are probably just a handful of bush planes really) that operate from an actual registered airport.
Even the Eastern US has to deal with river and water landings all the time. You can book a scheduled flight from the East River right in-between Manhattan and Brooklyn to Martha's Vineyard, or the Hamptons or a number of other destinations. Not to mention those happen in the middle of arguably the most complex commercial airspace in the world.
It's pretty cool to be on a ferry and see a plane land basically next to you in the middle of the river.
I've written various types of aviation support software on and off since the early 1980s.
Some of my favourite planes were the Grumman Mallards still owned and operated by Paspaley Pearling out of Mungalalu Truscott and other Kimberley airbases.
They're classic 1950s twin-engined amphibious aircraft that landed anywhere up and down the Kimberley Coast for pearling transfers.
>The only unique identifier for an aircraft across its lifecycle from production to end of life is a combination of the manufacturer, make and serial number.
>I know this because I am on (for better or worse) the patent that involves defining that as a unique identifier for aircraft.
Isn't that blindingly obvious? If so, how did it get to be a patent? And is someone now extracting rent from it?
Does it matter if something is obvious or not for getting a patent granted? From some casual looks of various US patents, it seems to be "First who writes an obtuse patent about thing X gets it granted", doesn't really matter if the thing is "novel" or not, just that no one tried to submit it before.
Legally it's supposed to matter, yes. Non-novel or obvious ideas are according to the law not eligible to be patented. In practice the mechanism to decide both of these is broken.
I see, that's really not visible in practice. Silly example perhaps, but US5443036A comes to mind which just shows how broken the system is:
> A method for inducing cats to exercise consists of directing a beam of invisible light produced by a hand-held laser apparatus onto the floor or wall or other opaque surface in the vicinity of the cat, then moving the laser so as to cause the bright pattern of light to move in an irregular way fascinating to cats, and to any other animal with a chase instinct.
How on earth is anyone supposed to take the patent system as a whole seriously when there are 100s (if not 1000s) of examples like that, which obviously shouldn't be approved if "novel" or "non-obvious" ideas are required?
The US patent system seems profoundly broken. Given that the patent system seems much less broken in other developed countries and the vast wealth and resources of the US, I assume it is broken on purpose?
It's not just the technology, it's the employment of it too. In 1993 this was a new way to use lasers, which a decade before were too expensive, delicate and power hungry to use as such.
Put another way, the change can be incremental. Building upon what is. Without this, pretty much all incremental science would lose funding, for the moment you invent, regardless of cost, it'd just be copied.
If you've ever done hardware, even a toy, it's not simple.
Extensive prototypes, testing for drops, hand fit, assembly at the factory, and more.
Devs today can't even conceive of making a 100% stable product to be shipped on floppy and never updated. Reshipping for bugfixes could break a company in the old days.
Now try that with hardware!
And all those tweaks, fixes, tests can be copied in a second without patents.
I think separating software and hardware patent discussions would be better here, because hardware patents are required.
> In 1993 this was a new way to use lasers, which a decade before were too expensive, delicate and power hungry to use as such.
I think your timescale is slightly off, but I don't know enough about laser history to say definitely. But judging by what I could find, in 1981 Popular Science seems to have run an ad for laser pointer devices, aimed (no pun intended) towards consumers:
> It wasn’t until the 1980s that lasers became small enough, and required so little energy, that they finally became cheap enough to be used in consumer electronics — take this funky laser pointer from the early 1980s, for example. The November 1981 edition of Popular Science features a Lasers Unlimited advertisement for an assortment of laser pointing devices, including a ruby laser ray gun, a visible red laser lightgun, multi-color lasers and laser light shows, all of which were selling for less than $15 (equivalent to about $42 today) - https://melmagazine.com/en-us/story/a-dazzling-history-of-th...
So if they became usable by consumers in the 1980s, I'm about 99% confident at least one individual used it for playing with their cats.
But since the author of the patent just happened to have spent the time (10 years later) to write the patent, they got it awarded to them.
Here's one that is only kind of mentioned, there are actually different altitudes. If you use ADS-B data you will only get the barometric altitude which is not calibrated to the ground pressure level. For example if you watch ADS-B data of flights into Denver it appears that every aircraft is crashing down ~5000ft during landing.
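A rough sketch of the altitude correction hinted at above, assuming the reported barometric value is pressure altitude referenced to the standard 1013.25 hPa setting and that the local sea-level pressure (QNH) is known from some other source. It uses the standard-atmosphere pressure altitude formula as an approximation; the function names and sample numbers are hypothetical, and height above ground would additionally require subtracting terrain or field elevation.

```python
STANDARD_PRESSURE_HPA = 1013.25


def isa_pressure_altitude_ft(pressure_hpa: float) -> float:
    """Altitude in the standard atmosphere at which this pressure occurs."""
    return 145366.45 * (1.0 - (pressure_hpa / STANDARD_PRESSURE_HPA) ** 0.190284)


def qnh_corrected_altitude_ft(pressure_altitude_ft: float, qnh_hpa: float) -> float:
    """Approximate altitude above mean sea level for a given local QNH."""
    return pressure_altitude_ft - isa_pressure_altitude_ft(qnh_hpa)


reported_ft = 6000.0  # hypothetical barometric altitude from a position report

# On a standard-pressure day the correction is zero.
print(round(qnh_corrected_altitude_ft(reported_ft, 1013.25)))

# On a high-pressure day the aircraft is higher than the raw value suggests.
print(round(qnh_corrected_altitude_ft(reported_ft, 1030.0)))
```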
Back when I was designing an app for air navigation, I came up with an alternative color scheme for various types of color blindness only to be told the target users were not allowed to be color blind (it was in France, much stricter than elsewhere it seems).
Well, as a senior software engineer and commercial pilot ... I am left confused.
Not all the things in the list, because I am aware of those. I might have missed the runway numbers changing based on shifting magnetic field of the earth, but that's a thing too. Runway 22? That's now Runway 21.
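The runway renumbering mentioned above follows from how the number is usually derived. A minimal sketch, assuming the common convention that the designator is the magnetic heading rounded to the nearest ten degrees and divided by ten, with 36 used instead of 00; parallel-runway suffixes and local exceptions are ignored, and the headings are made up.

```python
def runway_number(magnetic_heading_deg: float) -> str:
    """Two-digit runway designator for a magnetic heading in degrees."""
    tens = int((magnetic_heading_deg % 360 + 5) // 10) % 36
    if tens == 0:
        tens = 36  # north-facing runways are 36, not 00
    return f"{tens:02d}"


# A slow drift in magnetic declination can move a runway across a rounding boundary.
print(runway_number(216.0))
print(runway_number(214.0))
print(runway_number(358.0))
```

Any system that uses the runway designator as a stable key would see "22" silently become "21" for the same strip of pavement.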
But why programmers specifically would believe this, as opposed to ... any other profession that is not aviation?
There is a genre of articles that list similar non-intuitive facts about various domains (people's names, music, etc). The relation to programmers is that they are often creating software systems where some of these facts come into play, e.g., by using some values as primary keys, foreign keys, etc.
The article isn't meant to imply that only programmers would believe these. It's just a little niche of 'Falsehoods that Programmers believe about XYZ' sort of articles that became popular because programmers tend to write software that ends up interacting with real world systems that have edge cases many programmers would not consider if they're not dealing with the problem space for a while.
They can legally land in any navigable body of water (some restrictions in some areas for national preserves) but some lakes have defined water runways.
>If an aircraft diverts to another destination, it won’t divert again.
Hehe, I was once told we couldn't land at our destination A, so we got diverted to B; while on our way to B we were told we are actually going to C; and, while on our way to C, A became available again so the plane did a U-turn and we flew back to A, landing with a ~3 hour delay.
Haha, nice! My head as a programmer explodes while reading this list, because I feel like these are all reasonable assumptions and I feel how they are painfully discovered late into the implementation.
Also, I felt stupid very quickly. Very nice summary, bravo!
Honestly I am surprised by some of the points. But after reading all of it, now I am wondering as an outsider, what the hell is a "flight" if there's basically no good abstraction for this mess? What does it mean when a new flight is created, or what does the existence of any single flight mean?
Flight is a concept which is at least a body moving without support to an underlying surface. Everything else are human plugins added to help us exploit the concept in various circumstances. Any additional constraints on the concept are valid only in the circumstances they were invented for. Enumeration of the circumstances should take into account the participants of the communication context where the word "flight" is used.
I had known about some of these, and I had thought that some others are at least possible.
I know that there is an ICAO code on Mars (since I had read about it before).
I think there are some airports that have an ICAO code but no IATA code and vice-versa, and some have a "pseudo-ICAO" code with letters and numbers together.
Perhaps useful to produce a list of true constraints in contrast to false ones. Perhaps that would result in too many “except for”, “apart from” and “subject to” statements.
On the ADS-B receiver side, I'd add "Each ADS-B packet will be clearly heard by the on-ground receiver, there will be no other radio station sending an ADS-B packet when another station is actively transmitting" and "Only actual ADS-B stations use the 1090 MHz frequency, no one will attempt to maliciously jam the entire band".
> Sounds like a list of edge cases just like any other area.
That's exactly the point. The famous example (Falsehoods Programmers Believe About Names) has examples I have encountered in medical databases. If a programmer somewhere didn't fall into the trap, patient names in a medical database would have been better managed and may have avoided duplication, lost records, etc.
to the best of my recollection, the only way to tell a ship from a boat is to watch it make a "high" speed turn, ships lean out, boats lean in. But this is probably incorrect, just like all of my education was.
It's like receiving some API documentation that confidently declares some field as an ENUM and then a few hundred million rows later you discover that that was more like a suggestion and it's actually more like a free text field... sigh
Now you have to specify whether or not it’s moved during queries (and what if it moves again?) There’s probably a more elegant way I’m not thinking of, but standard created_at and updated_at fields would work: if a given date is <= the move date, it’s the original airport, else the new one. Rinse and repeat if it moves again.
https://www.airnavradar.com/data/airlines/tmw is a good example of some of these (depending on what time you check that link -- if it's night-time in the Maldives it's going to show you nothing)
1. I'll never need to learn a falsehood list, so I can skip it.
2. A falsehood list is complete at the time of writing.
3. OK, but it will surely get updated with new falsehoods and clarifications.
4. Skimming the falsehood list is all I need to do to learn it.
5. OK, but surely I'll remember to recheck the falsehood list once I actually need to, right?
6. If a falsehood doesn't immediately make sense to me, there must be something wrong with it, despite the author having domain expertise that I don't.
Literally had to point out just last night how UTC is not sufficient in all scenarios. I swear it happens every 6 mos on Reddit.
"Falsehoods "falsehoods programmers believe about falsehoods" blog posters believe about "falsehoods programmers believe about "falsehoods programmers believe about logic" blog" falsehoods"
Day by day it feels less and less like regular data modeling and more like a debate with Jordan Peterson where you argue for ten hours what a "name" is.
Eventually you end up having to make choices and deal with the consequences. Otherwise Jordan Peterson would have you chasing your tail for days about what a "choice" is, and nothing would ever get done.
tl;dr: just make your best guess and always include an extra "notes" column where things can get leaky.
Not days necessarily, but I think quite a bit of time should be spent data modeling, yes. Before you’ve ever touched the keyboard, it’s very helpful to attempt to model the problem on paper or a whiteboard. You quickly find problems with your initial guess that way.
Notes / data / extra et. al columns are the worst, as a DBRE. People inevitably shove various shit into them over time instead of making an effort to properly fix past mistakes, and at some point, they practically contain their own table.
Bit of a rant: what annoys me about these lists is how they just give off a huge "you are dumb for making any assumptions, how could you not think of <extremely obscure edge case>" vibe. I'd be interested to see what the effects are of these assumptions failing, because often they are pretty reasonable assumptions for a reasonable subset of the universe. Software is imperfect and you can't cover every possibility. Like ok technically 10 flights with the same number could leave the same gate at the same time, but if 99.99% of the time they don't and you assume that, what is the real impact to people?
Reminds me of a list that came up ages ago that presented an assumption of "X code always runs" with the counterpoint that you could unplug the computer. Ok sure, but then why write software at all? Clearly no point assuming any code will ever run since you can just terminate the program at any random time.
I don't agree that this list has the attitude you describe--if anything, they just seem proud that they have many fewer of these corner case bugs than anyone else--so it is difficult to work with your example of the flight number. These are, in fact, misconceptions made by programmers, often without having the in-depth knowledge of this specific area that comes from being an actual expert (the kind that often people don't allocate for in their budgets), and this list isn't an over-the-top portrayal of such: it feels weird to become offended?
That said, I do appreciate some of these lists--which maybe has put you on edge to the paradigm--do have an edge to them... but, in all honesty, I think they should? The bugs and edge cases that these lists tend to expose aren't random glitches that equally affect every user: they usually segment users into the ones whose lives "follow the happy path" (which often just means "are intuitive and familiar to the culture near the developer") and the users who get disproportionately (or even continually!) screwed every time they dare interact with a computer.
And like, it is actually a problem that the other side of this is almost always a developer who doesn't really give a shit and considers that user's (or even an entire region/country's) existence to somehow be a negligible statistic not worth their time or energy, and I really do think that they deserve to take some flak for that (the same way I try to not get offended if someone points out how my being a cis-het white male blinds me to stuff: I think I deserve to get held to task harder by frustrated minorities rather than force them to be nice all the time in a world that penalizes them).
I don't disagree with you at all. My point was more like what another commenter said, that software adheres to a strict and very finite set of rules, the real world is way more complicated than that. It's so trivially easy to find real world counterexamples to just about any software that it's a barely interesting exercise (IMO). So you define a reasonable subset and work with that. And the reasonable subset is probably defined by positive/negative outcomes.
It would have been cool if the blog post discussed those outcomes so we can reason about it properly, otherwise it's just a list of claims at face value. If the programmer making an assumption means a screen at a gate says the wrong boarding time when there's a human there controlling the boarding, then not the end of the world. But if the programmer making an assumption causes 1/10000 flights to crash, then that's interesting and worthwhile calling out. It's just endless speculation without a proper outcome to tie it down.
At a general level I think these lists make developers more aware of uniqueness and constraints.
When designing data I think these questions (skepticisms) should be front of mind;
1) natural values are not unique.
2) things identified by number are best stored as a string. If you're not going to do math on it, it's not a number. That "customer number" should be treated as "customer id" and as a string.
3) be careful constraining data. Those "helpful checks" to make sure the "zip code is valid" are harmful not helpful.
4) those tiny edge cases may "almost never happen" but they will end up consuming your support department. Challenge your own assumptions at every possible opportunity. Never assume anything you "know" is true.
It's hard to measure time saved, and problems avoided, with good design. But it's easy to see bad design as it plays out over decades.
And (especially today) never optimize design for "size". Y2K showed that folly once and for all.
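To make point 2 concrete, here is a minimal Python sketch. The ZIP code value is just an illustrative example; it shows what a round trip through an integer column would do to an identifier with a leading zero.

```python
# Illustrative value only: a US ZIP code that starts with a zero.
zip_as_text = "02134"

# What an integer column would effectively keep.
zip_as_int = int(zip_as_text)

# Reading it back as text loses the leading zero.
round_tripped = str(zip_as_int)

print(zip_as_text)    # 02134
print(round_tripped)  # 2134
```

Keeping the value as a string avoids the problem without needing any zero-padding logic on the way out.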
This implies denormalization, which is rarely needed for performance, despite what so many believe. Now you’ve introduced referential integrity issues, and have taken a huge performance hit at scale.
> 3)
I mean, maybe don’t try to use a regex on an email address beyond “is there a local and domain portion,” but a ZIP code, as in U.S. only, seems pretty straightforward to check. I would much rather have to update a check constraint if proven wrong than to risk bad data the rest of the time.
> never optimize for size
Optimize for size when it doesn’t introduce other issues. Anyone working on 2-digit years could have and likely did see that issue, but opted to ignore it for various reasons (“not my problem,” etc.). But for example, _especially_ since Postgres has a native type for IP addresses, there is zero reason to store them as strings in dotted quad. Even if you have MySQL, store them as a UINT32, and use its built-in functions to cast back and forth.
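As a rough illustration of the integer representation, here is the same round trip done in Python with the standard `ipaddress` module. This is only a sketch of the idea for IPv4; in MySQL the equivalent conversions are done by `INET_ATON()` and `INET_NTOA()`.

```python
import ipaddress

dotted = "192.168.0.1"

# Dotted-quad text to the 32-bit unsigned integer a column could hold.
as_int = int(ipaddress.IPv4Address(dotted))

# And back to text for display.
back = str(ipaddress.IPv4Address(as_int))

print(as_int)  # 3232235521
print(back)    # 192.168.0.1
```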
>It's so trivially easy to find real world counterexamples to just about any software that it's a barely interesting exercise (IMO).
These lists hopefully make programmers aware that a lot of their assumptions about the real world might be wrong, or at least questionable.
Examples are assumptions on the local part of email addresses without checking the appropriate RFCs. Which then get enshrined in e.g. JavaScript libraries which everyone copies. I've been annoyed for the last 30 years by websites where the local part is expected to be composed of only [a-z0-9_-] although the plus sign (and many other characters) are valid constituents of a local part.
Or assumptions on telephone numbers. Including various ways (depending on local culture) of structuring their notation, e.g. "123 456 789" versus "12-3456-89" where software is too dumb to just ignore spaces or dashes, or even a stray whitespace character copied by accident with the mouse.
And those forms where you have to enter a credit card (or bank account number) in fields of n characters each, which makes cut/copy/paste difficult because your notes contain it in the "wrong" format.
So while some examples may count as "just usability" it all stems from naive assumptions by programmers who think one size fits all (it doesn't).
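A small Python sketch of the more forgiving approach, under the assumption that separators in a phone number carry no meaning and that an email check should only confirm a local part and a domain exist. The function names `normalize_phone` and `looks_like_email` are made up for this example.

```python
import re


def normalize_phone(raw: str) -> str:
    """Drop whitespace, dashes, dots and parentheses; keep everything else."""
    return re.sub(r"[\s\-().]", "", raw)


def looks_like_email(raw: str) -> bool:
    """Only check for a non-empty local part and domain around the last @."""
    local, sep, domain = raw.strip().rpartition("@")
    return bool(local) and bool(sep) and bool(domain)


print(normalize_phone(" 123 456 789 "))          # 123456789
print(normalize_phone("12-3456-89"))             # 12345689
print(looks_like_email("user+tag@example.com"))  # True
print(looks_like_email("no-at-sign"))            # False
```

Anything stricter than this is probably better left to the mail server or the phone network to reject.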
I disagree, in my view they do not inherently give off such vibes at all. In this post for example, they specifically broach the topic like so:
> There are a lot of assumptions one could make when designing data types and schemas for aviation data that turn out to be inaccurate.
Sounds like a pretty explicit acknowledgement of the notion that these are otherwise reasonable assumptions that just happen to fail when put to the test, I'd say.
It's very easy to self-deprecate, especially if one has insecurities. But that doesn't mean that articles like this actually mean to do so. I think it's worthwhile for everyone involved to always evaluate whether the feeling is actually coming from the source you're looking at, or if that source just happened to trigger it inside you. More often than not, in my anecdotal experience, it's the latter.
I'd also find it interesting to learn what happens when these falsehoods nonetheless make it into an implementation though.
> I'd be interested to see what the effects are of these assumptions failing
Mostly confusion, but the combination of aviation and confusion can be dangerous and even deadly. Not directly related to this list, but I'm reminded of [1]: no one entity has set out to inconvenience the hapless traveler, but the combination of history and practice are a constant source of irritation, and at the times of heightened tensions and security might even lead to scary incidents. All because of the name.
Here's a fun true story..
Aircraft do not have a singular unique identifier that is time invariant.
While it is true that aircraft have serial numbers issued to their airframe, by itself, aircraft serial numbers are not unique.
The only unique identifier for an aircraft across its lifecycle from production to end of life is a combination of the manufacturer, make and serial number.
I know this because I am on (for better or worse) the patent that involves defining that as a unique identifier for aircraft.
The combination of ICAO aircraft type designator + serial number approximately is the most permanent identifier for an airframe - and even then - if an airframe is modified significantly enough that it no longer is the previous type - even then this identifier can change.
Personally, it boggled my mind that something as big as an aircraft did not have a simple time invariant unique identifier.
P.S. For those who might ask - aircraft registration numbers are like license plates, so they change - tail numbers can be ambiguous and misinterpreted depending on what is painted on the aircraft where, and ICAO 24-bit aircraft addresses are tied to ADS-B transponder boxes, which technically can be moved and reprogrammed between aircraft also.
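A rough Python sketch of that composite identifier. The class name, field names and sample values are all hypothetical and not taken from the patent; the point is only that the serial number is part of the key rather than the key itself, and that the registration is a changeable attribute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AirframeKey:
    """Hypothetical composite key; frozen so it can be used as a dict key."""
    manufacturer: str
    model: str
    serial_number: str


# Made-up airframes that happen to share a serial number.
first = AirframeKey("Maker A", "Model X", "12345")
second = AirframeKey("Maker B", "Model Y", "12345")

# Registration is stored as a value because it can change over time.
current_registration = {first: "REG-1", second: "REG-2"}

print(first == second)            # False
print(len(current_registration))  # 2
```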
> combination of the manufacturer, make and serial number.
> patent that involves defining that as a unique identifier for aircraft.
Now I got mighty curious what makes this novel enough to be a patent.
Go work at a big company. The patent lawyers come around and ask what you've been working on, and a month or two later, your name's on 10 patents, none of which make any sense whatsoever. If you're very lucky you might get a dollar bill for each.
You bury this simple idea in pages and pages of obfuscated tedium, and that's good enough that everyone is happy. Patent office gets their fee, lawyers get paid, company can say it has a supercharged patented innovation.
It's a unique identifier but now "not on a computer".
I was wondering the same thing. I've had to derive unique identifiers from hundreds of different data sets over the years. What makes it special when it's a plane?
Maybe it's a right-click level of patent.
Reminds me of the ship of Theseus: https://en.wikipedia.org/wiki/Ship_of_Theseus
The “real” unique identifier is such a common problem. And the solution is almost always “model, make, and serial number.”
Plus year of production if necessary.
I’ve seen programmers attempt to deduplicate humans by language spoken.
> And the solution is almost always “model, make, and serial number.”
If you've ever spent time in old car forums, you learn that even this isn't enough because of production-line sloppiness.
Serial number re-use is rare, but it happens. Usually because a product had something detected that resulted in remanufacturing, but sometimes other things slip.
I know about systems that had two types of serial numbers which ought to be the same, but weren't, because they had been programmed at different EOL stages when daylight saving time kicked in. One of the systems ran in UTC, the other in local time. The date was part of the serial.
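A minimal Python sketch of how that kind of mismatch can arise. The serial format (`YYYYMMDD-unit`) and the UTC+2 local offset are assumptions for illustration, not details of the systems described above.

```python
from datetime import datetime, timedelta, timezone


def serial_with_date(unit_id: str, stamp: datetime) -> str:
    """Hypothetical serial format: a YYYYMMDD date prefix plus a unit id."""
    return f"{stamp:%Y%m%d}-{unit_id}"


# One single instant, late in the evening in UTC.
instant = datetime(2024, 3, 31, 22, 30, tzinfo=timezone.utc)

# Assumed local zone: UTC+2, for example a UTC+1 region on summer time.
local = instant.astimezone(timezone(timedelta(hours=2)))

serial_utc = serial_with_date("0042", instant)
serial_local = serial_with_date("0042", local)

print(serial_utc)    # 20240331-0042
print(serial_local)  # 20240401-0042
```

Same unit, same moment, two different serials, simply because one stage formatted the date in local time.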
> I’ve seen programmers attempt to deduplicate humans by language spoken.
How is that supposed to help? If two people have the same name, it's overwhelmingly likely that they also speak the same language.
How does "model, make and serial number" translate to humans?
(No racist intentions here, but you bring up both points and I thought that to be interesting)
Johnson Smith
The son of John who is a smith
I'm only joking a little. Funny thing, surnames aren't actually that old for Europeans. Most of history there'd be maybe two people with the same name. They solved it back then very much the same way we solve it now.
https://en.wikipedia.org/wiki/Category:Occupational_surnames
https://en.wikipedia.org/wiki/Patronymic
Even runway IDs change over time: https://en.m.wikipedia.org/wiki/Runway#:~:text=Runway%20desi...
Aircraft of Theseus
Exactly!
The full Theseus treatment would need you to take [part of] the airframe that first plane discarded, then recertify it for use under its original serial number.
Is that allowed?
The way the Aircraft of Theseus is generally resolved is there’s a piece of metal called the “data plate”. This is the airplane as far as the FAA is concerned. I’ve been in a vintage biplane that was completely rebuilt from the data plate up. I think they got it for $40k.
It was worth it because without that, a home built airplane would have an experimental certificate and you couldn’t sell rides in it.
Does the data plate not limit the scope of what can be built around it?
In other words had Virgin Galactic built the VSS Enterprise around the data plate of a Cessna 172, would it then no longer have been an experimental aircraft?
But does the data plate have an ID?
Yes, but it’s not necessarily unique like a VIN.
> Personally, it boggled my mind that something as big as an aircraft did not have a simple time invariant unique identifier.
It boggles my mind that despite not having some sort of universal system things work as well as they do.
Aviation grew up relatively insular, and each country that had any sort of aircraft manufacturing did things their own way until fairly recently. Arguably, the first half of the history of aviation is a kind of free-for-all. The fact that we now have a globalized airline industry that mostly follows some kind of standards is the mind-blowing part to me. And I suspect if we weren't mostly down to a dozen or so manufacturers for the vast majority of airliners, even that wouldn't be the case.
Yeah but at some point countries started buying larger planes from only one or two manufacturers. At that point the manufacturers could standardize things.
> combination of the manufacturer, make and serial number
What if a new aircraft were made 50/50 from the parts of two older aircraft
It would depend on which side got the data plate.
I would also say engine too.
Engines are treated as their own thing separate to the airframe, with their own serial numbers.
And many commercial airliners are sold without engines at all.
The operators, such as Delta, do not actually own the engines on the aircraft they fly, even though they own the aircraft. The engines are rented from e.g. Pratt & Whitney along with a maintenance contract. That said, the engines are in fact installed at the factory.
Engines are actually changed fairly frequently because they're a wear component on most airplanes. They are also sometimes updated to a newer version or even an entirely different manufacturer. And often it's faster and cheaper to swap in a new engine that's ready to go rather than wait for the one that's attached to be overhauled, so the same engine might see service on multiple airframes.
A lot of these so called "falsehoods" are just design failures on the part of programmers. Someone did it badly first, and it stuck, and a second person came in later and is surprised by the bad design. That's not really interesting, it happens all the time in software. So much so that seasoned engineers have come to expect poor design until proven otherwise.
Things like flight numbers not having reasonable semantics, or conceptual pollution of what a flight is to include multiple take offs and landings are bad design, plain and simple. Just model the problem correctly e.g. maybe a Trip is multiple Flights, or Flights have multiple Legs. This isn't aviation specific. These are generic problems that programmers can and should get right.
Some of it is intrinsic to the domain, like flights not all having gates, or not landing at airports. That was a new tidbit for me.
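One way the "Flights have multiple Legs" idea might look in Python. This is only a sketch: the class names, fields and the placeholder location codes are invented for illustration, and the optional fields reflect the point that not every leg has a gate or a known airport at either end.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Leg:
    origin: str                 # free-text location, not necessarily an airport
    destination: Optional[str]  # None if unknown or not a named place
    gate: Optional[str] = None  # many flights never use a gate


@dataclass
class Flight:
    flight_number: str
    legs: List[Leg] = field(default_factory=list)

    def stops(self) -> List[str]:
        """Origin of the first leg followed by every known destination."""
        if not self.legs:
            return []
        known = [leg.destination for leg in self.legs if leg.destination]
        return [self.legs[0].origin] + known


# Placeholder codes, not real airports.
flight = Flight("XX123", [Leg("AAA", "BBB", gate="12"), Leg("BBB", "CCC")])
print(flight.stops())  # ['AAA', 'BBB', 'CCC']
```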
The claim isn't that programmers go around literally believing falsehoods about a given domain. The whole point of the "falsehoods programmers believe about X" genre is a tongue-in-cheek way of listing the kind of bad design assumptions that happen in a given domain, and I believe that is very interesting indeed.
The fact remains that software that models real-life events or information is making normative assumptions about what can and cannot happen in the domain, due to the very nature of software, and these assumptions are knowingly-or-not being introduced by programmers. If for any given domain we had hundreds of human notaries, scribes or typists managing information instead of software, their mistaken assumptions wouldn't matter—they would simply go "Oh, that's odd", make the necessary adjustments, and learn from the experience. But as long as software is a prescriptive model of what it is representing, it will be valuable to highlight the "falsehoods" that its creators may accidentally prescribe into it.
I don't understand your argument
it doesn't matter whose failure it is
the point of the article (just as with the one about names) is that there are "reasonable defaults" many people would believe - that don't work in practice and become gotchas
whether you have enough knowledge to know that something is unreasonable doesn't mean it doesn't seem reasonable for many others
And as programmers, we inevitably spend most of our time dealing with these weird edge cases, because the stuff that makes sense is generally incorporated into our initial modelling and becomes a solved problem.
As usual with these lists, they would much benefit from more in-depth explanation. This list at least deigns to link to examples for many of the claims (like a flight that leaves on time but arrives 40 hours late [1]), but doesn't explain what happened.
Having said that, many of the links are very informative. For example the crater on Mars that has an ICAO airport code [2]: "On 19 April 2021, Ingenuity performed the first powered flight on Mars from Jezero, which received the commemorative ICAO airport code JZRO."
[1] https://www.flightaware.com/live/flight/PDT5965/history/2025...
[2] https://en.wikipedia.org/wiki/Jezero_(crater)
This is often for boring reasons - the two week flight was a Google Balloon, the flight was delayed for 40 hours due to bad weather, ADS-B is set by the pilot and many pilots simply set it wrong, and so on.
I develop software for flight data analysis at a company that makes flight data recorders. Our focus is mainly helicopters, but some fixed wing. Dealing with aircraft that may takeoff or land at a base, hospital, roof, parking lot, football field, airport, golf course, etc I feel like most of my days are spent on all sorts of falsehoods about aviation.
Adding to the list: Runway numbers never change [1]
[1] https://www.ncei.noaa.gov/news/airport-runway-names-shift-ma...
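Runway designators are normally derived from the runway's magnetic heading rounded to the nearest ten degrees, which is why a drifting magnetic field can rename a runway. A small Python sketch of that rule follows; the function name is made up, and real designations also involve L/C/R suffixes and local exceptions that this ignores.

```python
def runway_number(magnetic_heading_deg: float) -> int:
    """Round the heading to the nearest 10 degrees and drop the last digit."""
    n = int((magnetic_heading_deg % 360 + 5) // 10)
    # A heading near north is numbered 36, never 0.
    return 36 if n == 0 else n


print(runway_number(216))  # 22
print(runway_number(214))  # 21
print(runway_number(2))    # 36
```

A shift of only two degrees, from 216 to 214, is enough to turn Runway 22 into Runway 21.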
Funny how the common thread through many of these 'Falsehoods...' posts is that many programmers think that systems designed by humans, for humans, and kept running by humans will rigidly adhere to a set of rules and don't have edge cases.
Us programmers like to distill everything down to rigid sets of rules because that's how our mind operates. The fewer probabilistic "analog" parameters, the better. Of course the real world doesn't work this way.
> because that's how our mind operates
It is by no means specific to programmers. Ask someone who learns French, for instance. Rules with too many arbitrary exceptions.
What is specific to programmers is that their tool performs at its best with simpler rules, so their job is to find the necessary and sufficient set of rules - and will dismiss most of the cases pointed by this article as unimportant exceptions the software won't handle.
> Ask someone who learns French, for instance. Rules with too many arbitrary exceptions.
I took French in middle school, and it was always a running joke that the teacher spent the first 5 minutes on the rule, and the next 40 minutes on the exceptions.
That is still a good rule, even with 40 minutes of exceptions, if it covers much more than that.
In the end the data has to fit into structures or tables that can be processed by some algorithms. If the system is not rigid to a certain degree it would become unmaintainable or full of bugs or both.
Not really. It's just that software by definition must create a model of the domain it attempts to handle. And a model is, in the end, a set of rules. With an absence of rules, the software can't really do anything, and would be pretty pointless for actually solving any problem. The alternative is to hand the users Notepad and say "knock yourselves out".
I'd argue that programmers are indeed much more aware of how many exceptions and edge cases most real world domains have. Ask a lay person about such a simple thing as leap seconds, for instance, and they'll often believe you're making shit up.
The profession of programming is fundamentally about the interface between squishy human systems and rigid rule-based machines. No surprise that keeps coming up.
It is the classic scenario of confusing the map with the territory.
In the map everything is clear. It is clear what a "plane" is, what "airports" are and what their relationship is. And transferring that into a computer program is straightforward.
In the territory everything is fuzzy. None of the definitions are without edge cases and the expected relationships are often violated in surprising ways.
Aviation isn't unique here, every system suffers from the distinction between its actual function and the abstract description of that system.
I think it's more, "you might think as a programmer writing software that models the world of aviation that you could assume the following things— but alas."
Software unfortunately follows rigid rules so the challenge is finding a set of rigid rules that can encompass reality. It would be pretty natural if you were writing a database schema that a flight would have a departing airport and an arriving airport— but alas.
Well what I'm speaking to more is that most systems that you model, most of the model is already assumptions. So natural or not, that database schema is already invoking assumptions which may or may not be false. Especially when dealing with any system where humans are directly involved in it. For many things, there's no exhaustive list of rules that will cover all the cases. As they say, if you make something idiot-proof, they'll invent a better idiot.
And this is why I much prefer surrogate values for primary keys over natural values. And why I've gravitated to using UUID values for surrogates, not integer identities.
A theme running through the article is "this value is unique" and "this value does not change". And of course those are both wrong.
So when designing databases now I assume "everything changes, nothing is unique" (even when the domain "expert" professes it is.)
This approach solves so many problems and saves so much time later on when it turns out that that "absolutely, positively, unique for ever" natural key, isn't.
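A minimal Python sketch of the surrogate-key idea, with made-up class and field names and placeholder codes. The record gets a random UUID when it is created, other records point at that UUID, and the supposedly permanent natural values are free to change.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Airport:
    code: str  # natural value, assumed to be changeable
    name: str  # natural value, assumed to be changeable
    id: uuid.UUID = field(default_factory=uuid.uuid4)  # surrogate key


airport = Airport(code="AAA", name="Old Field")

# Another record references the airport by surrogate id, not by code.
departure = {"flight": "XX123", "airport_id": airport.id}

# The "permanent" code and name change; the reference is unaffected.
airport.code = "BBB"
airport.name = "New Field"

print(departure["airport_id"] == airport.id)  # True
```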
The tradeoff you’re making is performance, sometimes a lot depending on your RDBMS and table size. For smaller tables, under 10,000,000 rows or so, you won’t really notice much, but in the hundreds of millions or billions, you definitely do.
A UUID is at best 2x larger than even a BIGINT, thus the index size is 2x larger. If you aren’t using v1 or v7, it’s also not k-sortable. But most importantly for MySQL (and optionally SQL Server) if the table contains things related to a common entity, like a user’s purchases, the rows are now scattered around the clustering index’s B+tree. That incurs a huge amount of I/O on large tables, and short of a covering index and/or partitioning (which only masks the problem by shrinking the search space), there is no way to improve it. If instead the PK was (user_id, some_other_identifier), all records for a given user are physically co-located.
But it doesn't help much, as the surrogate only lives in your system.
So now some information comes in from outside the system that something happened with a plane, and you still have to find which surrogate id that plane has in your system.
You may decide two things happened to two different planes whereas another system consider it the same plane both times, and vice versa.
The uuid keys make it easy to change some value, but won’t solve the issue of keeping a record of historical changes.
UUID keys PLUS some form of versioning with creation dates will let you change an airport name and let you know what the airport name was on some arbitrary date in the past. Useful for backfills and debugging
You don’t need all that; any candidate key (even natural) with the addition of a datetime would work. What was the definition of Airport X before Datetime Y? And after? Etc.
But if your natural key is the thing that changed you’d never know that airport x was renamed to airport y. You’d just have two different keys.
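A sketch of the "key plus effective date" lookup being discussed, in Python. The history rows, names and dates are invented; the assumption is that each row records the date from which a name applies, kept sorted by that date, under one stable key.

```python
from bisect import bisect_right
from datetime import date
from typing import List, Optional, Tuple

# (effective_from, name) rows for one stable key, sorted by date.
history: List[Tuple[date, str]] = [
    (date(2000, 1, 1), "Old Name"),
    (date(2015, 6, 1), "New Name"),
]


def name_as_of(rows: List[Tuple[date, str]], day: date) -> Optional[str]:
    """Return the name in effect on the given day, or None if before any row."""
    dates = [effective_from for effective_from, _ in rows]
    i = bisect_right(dates, day)
    return rows[i - 1][1] if i else None


print(name_as_of(history, date(2010, 1, 1)))  # Old Name
print(name_as_of(history, date(2015, 6, 1)))  # New Name
print(name_as_of(history, date(1999, 1, 1)))  # None
```

This only works if the key the history hangs off does not itself change, which is the objection raised above.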
Falsehoods people believe about programmers:
* Programmers believe they are handling all possible configurations of the universe when putting something into production.
* Programmers don't handle all possible configurations of the universe when putting code into production because they don't know any better.
Falsehoods people believe about the universe:
* There exists a constant.
* SI units are constant at all times or everywhere.
Falsehoods people believe about programs:
* When a new corner case appears, it is easy to adjust the program to handle it.
Falsehoods people believe about falsehoods about programmers: the stated falsehoods are actually false.
I always look at these "Falsehoods Programmers Believe..." lists as a source of tests. Each item should spawn a number of unit or integration tests that will help to uproot any of these assumptions that were incorrectly baked into your software.
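For example, two of the items might turn into tests like the following Python sketch. The `FlightRecord` type and `describe` function are hypothetical stand-ins for whatever code is under test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlightRecord:
    origin: Optional[str]       # may be unknown or not an airport at all
    destination: Optional[str]  # same


def describe(record: FlightRecord) -> str:
    """Render a flight without assuming either end is a known airport."""
    origin = record.origin or "unknown location"
    destination = record.destination or "unknown location"
    return f"{origin} -> {destination}"


def test_flight_without_airports():
    # Falsehood: flights take off and land at airports.
    expected = "unknown location -> unknown location"
    assert describe(FlightRecord(None, None)) == expected


def test_flight_returning_to_origin():
    # Falsehood: origin and destination are always different.
    assert describe(FlightRecord("AAA", "AAA")) == "AAA -> AAA"
```

Run under pytest, each falsehood becomes a named regression test rather than a surprise in production.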
Each of these did indeed spawn tests. I used to work there and at the time there were over a thousand ranging from humdrum to David Blaine skydiving. They’re a crowd who really put a focus on good engineering
Uhh, and how often were you able to run the Blaine skydiving test? :D
I find this list strange. I have only a passing interest in aviation and I would not believe very many of these.
What made the corresponding lists for names and time interesting were that it was genuinely surprising to realise that their statements were actually false. I don't get that feeling with these.
Like the top level comment about identifiers for airplanes -- why would they have them? That sounds baffling to me. With ownership changes, continuous upgrades, extending airframes, repurposing etc. I would be surprised if there was a stable identity.
Yeah. This list is more like "model simplifications you should think twice about making".
Yes, this list is insulting to programmers.
Great Summary of how messed up Airport Codes are by CGP Grey
https://www.youtube.com/watch?v=jfOUVYQnuhw
including (attempts at) a few in-depth reasons for why these quirks exists
> Flights take off and land at airports
My impression is that every single older (pre-2010) computer system that manages Brazilian aviation fell for that and fixed it with a hack.
> Airports never move
Also, Runways never move. Also, if runways move, they don't change direction. Also, if airport or runways move, there will exist some construction work before.
I'd add "aircraft only land in runways" there too. And "ok, aircraft only land in runways and heliports".
> My impression is that every single older (pre-2010) computer system that manages Brazilian aviation fell for that and fixed it with a hack.
Can you elaborate more?
I would assume it's somewhat speaking to the prevalence of many informal landing strips, and also that river landings are probably fairly common too. I'd have to imagine places like Alaska might also have to deal with that, especially if you have small local 'airlines' (which are probably just a handful of bush planes really) that operate from an actual registered airport.
Even the Eastern US has to deal with river and water landings all the time. You can book a scheduled flight from the East River right in-between Manhattan and Brooklyn to Marthas Vineyard, or the Hamptons or a number of other destinations. Not to mention those happen in the middle of arguably the most complex commercial airspace in the world.
It's pretty cool to be on a ferry and see a plane land basically next to you in the middle of the river.
I've written various types of aviation support software on and off since the early 1980s.
One of my favourite planes were the Grumman Mallards still owned and operated by Paspaley Pearling out of Mungalalu Truscott and other Kimberley airbases.
They're classic 1950s twin-engined amphibious aircraft that landed anywhere up and down the Kimberley Coast for pearling transfers.
* https://en.wikipedia.org/wiki/Grumman_G-73_Mallard
* https://en.wikipedia.org/wiki/Mungalalu_Truscott_Airbase
A reference to Air France 447 perhaps?
> The only unique identifier for an aircraft across its lifecycle from production to end of life is a combination of the manufacturer, make and serial number.
> I know this because I am on (for better or worse) the patent that involves defining that as a unique identifier for aircraft.
Isn't that blindingly obvious? If so, how did it get to be a patent? And is someone now extracting rent from it?
Does it matter if something is obvious or not for getting a patent granted? From some casual looks at various US patents, it seems to be "First who writes an obtuse patent about thing X gets it granted", doesn't really matter if the thing is "novel" or not, just that no one tried to submit it before.
Legally it's supposed to matter, yes. Non-novel or obvious ideas are according to the law not eligible to be patented. In practice the mechanism to decide both of these is broken.
I see, that's really not visible in practice. Silly example perhaps, but US5443036A comes to mind which just shows how broken the system is:
> A method for inducing cats to exercise consists of directing a beam of invisible light produced by a hand-held laser apparatus onto the floor or wall or other opaque surface in the vicinity of the cat, then moving the laser so as to cause the bright pattern of light to move in an irregular way fascinating to cats, and to any other animal with a chase instinct.
How on earth is anyone supposed to be able to take the patent system as a whole seriously when there are 100s (if not 1000s) of examples like that, which obviously shouldn't be approved if "novel" or "non-obvious" ideas are required.
There is also a granted patent for throwing a stick for a dog:
https://patents.google.com/patent/US6360693B1/en
The US patent system seems profoundly broken. Given that the patent system seems much less broken in other developed countries and the vast wealth and resources of the US, I assume it is broken on purpose?
That one is kind of amazing - I wonder if it was intended as a self-parody of the patent system? But in general yes I agree.
What's not new about it?
It's not just the technology, it's the employment of it too. In 1993 this was a new way to use lasers, which a decade before were too expensive, delicate and power hungry to use as such.
Put another way, the change can be incremental. Building upon what is. Without this, pretty much all incremental science would lose funding, for the moment you invent, regardless of cost, it'd just be copied.
If you've ever done hardware, even a toy, it's not simple.
Extensive prototypes, testing for drops, hand fit, assembly at the factory, and more.
Devs today can't even conceive of making a 100% stable product to be shipped on floppy and never updated. Reshipping for bugfixes could break a company in the old days.
Now try that with hardware!
And all those tweaks, fixes, tests can be copied in a second without patents.
I think separating software and hardware patent discussions would be better here, because hardware patents are required.
> In 1993 this was a new way to use lasers, which a decade before were too expensive, delicate and power hungry to use as such.
I think your timescale is slightly off, but I don't know enough about laser history to say definitely. But judging by what I could find, in 1981 Popular Science seems to have run an ad for laser pointer devices, aimed (no pun intended) towards consumers:
> It wasn’t until the 1980s that lasers became small enough, and required so little energy, that they finally became cheap enough to be used in consumer electronics — take this funky laser pointer from the early 1980s, for example. The November 1981 edition of Popular Science features a Lasers Unlimited advertisement for an assortment of laser pointing devices, including a ruby laser ray gun, a visible red laser lightgun, multi-color lasers and laser light shows, all of which were selling for less than $15 (equivalent to about $42 today) - https://melmagazine.com/en-us/story/a-dazzling-history-of-th...
So if they became usable by consumers in the 1980s, I'm about 99% confident at least one individual used it for playing with their cats.
But since the author of the patent just happened to have spent the time (10 years later) to write the patent, they got it awarded to them.
Another discussion of this article, but on a FlightAware forum:
https://www.flightaware.com/squawks/view/1/7_days/popular_ne...
Here's one that is only kind of mentioned, there are actually different altitudes. If you use ADS-B data you will only get the barometric altitude which is not calibrated to the ground pressure level. For example if you watch ADS-B data of flights into Denver it appears that every aircraft is crashing down ~5000ft during landing.
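A rough Python sketch of the arithmetic involved. It assumes the reported value is pressure altitude referenced to the standard 1013.25 hPa setting, uses the common rule of thumb of about 27 ft per hPa (an approximation, not an exact conversion), and takes a field elevation of about 5,400 ft as an illustrative figure for Denver. The function name and sample numbers are invented.

```python
STANDARD_PRESSURE_HPA = 1013.25
FEET_PER_HPA = 27.0  # rule-of-thumb approximation near sea level


def height_above_field(pressure_altitude_ft: float,
                       qnh_hpa: float,
                       field_elevation_ft: float) -> float:
    """Correct pressure altitude for local pressure, then subtract elevation."""
    correction = (qnh_hpa - STANDARD_PRESSURE_HPA) * FEET_PER_HPA
    return pressure_altitude_ft + correction - field_elevation_ft


# Same reported altitude, two different local pressure settings.
print(height_above_field(7400, 1013.25, 5400))  # 2000.0
print(height_above_field(7400, 1023.25, 5400))  # 2270.0
```

The same reported number can therefore mean noticeably different heights above the ground depending on the day's pressure and where the ground actually is.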
Back when I was designing an app for air navigation, I came up with an alternative color scheme for various types of color blindness only to be told the target users were not allowed to be color blind (it was in France, much stricter than elsewhere it seems).
Well, as a senior software engineer and commercial pilot ... I am left confused.
Not all the things in the list, because I am aware of those. I might have missed the runway numbers changing based on shifting magnetic field of the earth, but that's a thing too. Runway 22? That's now Runway 21.
But why programmers specifically would believe this, as opposed to ... any other profession that is not aviation?
There is a genre of articles that list similar non-intuitive facts about various domains (people's names, music, etc). The relation to programmers is that they are often creating software systems where some of these facts come into play, e.g., by using some values as primary keys, foreign keys, etc.
The article isn't meant to imply that only programmers would believe these. It's just a little niche of 'Falsehoods that Programmers believe about XYZ' sort of articles that became popular because programmers tend to write software that ends up interacting with real world systems that have edge cases many programmers would not consider if they're not dealing with the problem space for a while.
Programmers learning that education is a thing
The author is a developer at FlightAware, they're just showcasing the difficulty of writing software for aviation
> But why programmers specifically would believe this, as opposed to ... any other profession that is not aviation?
I don't read it as programmers specifically believing that, it's that they're specifically treating these things as invariants in their projects.
Don't forget float planes can land at a pretty massive number of docks
They can legally land in any navigable body of water (some restrictions in some areas for national preserves) but some lakes have defined water runways.
Falsehoods managers force programmers to believe about X
>If an aircraft diverts to another destination, it won’t divert again.
Hehe, I was once told we couldn't land at our destination A, so we got diverted to B; while on our way to B we were told we are actually going to C; and, while on our way to C, A became available again so the plane did a U-turn and we flew back to A, landing with a ~3 hour delay.
The cause was snow and wind.
Haha, nice! My head as a programmer explodes while reading this list, because I feel like these are all reasonable assumptions and I feel how they are painfully discovered late into the implementation.
Also, feeling myself stupid very quickly. Very nice summary, bravo!
Honestly I am surprised by some of the points. But after reading all of it, now I am wondering as an outsider, what the hell is a "flight" if there's basically no good abstraction for this mess? What does it mean when a new flight is created, or what does the existence of any single flight mean?
It means what it needs to mean so everyone involved will know what it means in the moment
Flight is a concept which is at least a body moving without support to an underlying surface. Everything else are human plugins added to help us exploit the concept in various circumstances. Any additional constraints on the concept are valid only in the circumstances they were invented for. Enumeration of the circumstances should take into account the participants of the communication context where the word "flight" is used.
I had known about some of these, and I had thought that some others are at least possible.
I know that there is an ICAO code on Mars (since I had read about it before).
I think there are some airports that have an ICAO code but no IATA code and vice versa, and some have a "pseudo-ICAO" code with letters and numbers together.
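One way to model that, as a minimal sketch: make every code optional and never use any single one as the primary key. The `Airport` class, its field names, and the sample codes below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Airport:
    """Hypothetical record: every code is optional, none is the identity."""
    name: str
    icao: Optional[str] = None
    iata: Optional[str] = None
    local_code: Optional[str] = None  # e.g. a "pseudo-ICAO" code mixing letters and digits

    def best_code(self):
        """Return whichever code exists, preferring ICAO, or None if there is none."""
        return self.icao or self.iata or self.local_code


# Made-up examples, not real airports or real codes.
iata_only = Airport(name="Example Field", iata="ZZZ")
no_codes = Airport(name="Private Strip")
```

Display code can call `best_code()`, while joins and lookups should go through an internal surrogate key instead.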
Perhaps useful to produce a list of true constraints in contrast to false ones. Perhaps that would result in too many “except for”, “apart from” and “subject to” statements.
Aside: is there a notation for such constraints?
RFC 2119, and be very exact with your language? Don't use "MUST" if there are exceptions, for instance.
On the ADS-B receiver side, I'd add "Each ADS-B packet will be clearly heard by the on-ground receiver, there will be no other radio station sending an ADS-B packet when another station is actively transmitting" and "Only actual ADS-B stations use the 1090 MHz frequency, no one will attempt to maliciously jam the entire band".
Sounds like a list of edge cases just like any other area.
Myths programmers believe about cars:
Cars in the same lane always travel in the same direction.
Each street has a name.
Each street has a unique name.
Each street has only one name.
Cars have four wheels.
Cars never move vertically.
Roads never move.
Roads never cross water without bridges.
When two roads cross, they do so at an intersection.
Take any field in human experience and one can make such a list.
All boats float. Ships are bigger than boats. Boats are slower than airplanes. Boats only travel on water.
> Sounds like a list of edge cases just like any other area.
That's exactly the point. The famous example (Falsehoods Programmers Believe About Names) has examples I have encountered in medical databases. If a programmer somewhere didn't fall into the trap, patient names in a medical database would have been better managed and may have avoided duplication, lost records, etc.
https://news.ycombinator.com/item?id=18567548
> Ships are bigger than boats
To the best of my recollection, the only way to tell a ship from a boat is to watch it make a "high" speed turn: ships lean out, boats lean in. But this is probably incorrect, just like all of my education was.
The only one I know is that ships can carry boats, not vice versa.
Ships can carry ships, though. But then the ship becomes a boat?
ships use ports, boats use docks.
It's like receiving some API documentation that confidently declares some field as an ENUM and then a few hundred million rows later you discover that that was more like a suggestion and it's actually more like a free text field... sigh
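A defensive way to ingest such a field, sketched under the assumption of a hypothetical `FlightStatus` enum: map the values you recognise and keep the raw text for everything else, rather than rejecting the row.

```python
from enum import Enum


class FlightStatus(Enum):
    """Hypothetical set of documented values."""
    SCHEDULED = "scheduled"
    DEPARTED = "departed"
    ARRIVED = "arrived"


def parse_status(raw):
    """Return (known_status_or_None, original_text) so nothing is lost."""
    cleaned = raw.strip().lower()
    try:
        return FlightStatus(cleaned), raw
    except ValueError:
        # Not one of the documented values: keep the raw text for later review.
        return None, raw


known = parse_status(" Scheduled ")
unknown = parse_status("sched. (maybe)")
```

Storing the raw value next to the parsed one means a later schema change can reinterpret old rows.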
>Airports never move
I can imagine them going "I had a perfect database schema that covered every edge case, and then..." with each bullet point.
Exactly I think that’s the point! Trying to make a strongly typed model, APIs and templates, etc… all while reality is making other plans
That's when you ask the user to add another airport with the same name and -2 at the end. Add a "has moved" field!
Now you have to specify whether or not it’s moved during queries (and what if it moves again?) There’s probably a more elegant way I’m not thinking of, but standard created_at and updated_at fields would work: if a given date is <= the move date, it’s the original airport, else the new one. Rinse and repeat if it moves again.
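The date-based idea can be sketched as an effective-dated history: each entry says from which day a location applies, and a lookup picks the latest entry that starts on or before the query date. The history below is made up, and treating the move date itself as belonging to the new site is a convention chosen for this sketch.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical history for one airport: (valid_from, location), sorted by date.
HISTORY = [
    (date(1990, 1, 1), "original site"),
    (date(2015, 6, 1), "new site"),
]


def location_on(history, day):
    """Return the location in effect on `day`, or None if before the first entry."""
    starts = [start for start, _ in history]
    # Index of the last entry whose start date is <= day.
    index = bisect_right(starts, day) - 1
    if index < 0:
        return None
    return history[index][1]


before_move = location_on(HISTORY, date(2000, 1, 1))
after_move = location_on(HISTORY, date(2020, 1, 1))
```

A second move is just another row in the history, so no "has moved" flag is needed.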
The reassignment of a live IATA code between two airports was absolutely cluster fuck level.
This had never happened before.
Like, you don't even _change_ the IATA code of a live airport. To switch them was a huuuuuuuuuge assumption breaker for the industry.
The biggest falsehood programmers believe is “a data restriction or programmer difficulty will have any effect on management decisions”.
Missing "Aircraft land", the shortest falsehood.
Which aircraft don't land in one form or another?
Spacecraft are aircraft. Voyager hasn’t landed.
Touchdown on water, on an aircraft carrier, rapid disassembly in air
Another falsehood is that airplane data companies won't cave to legal or monetary threats. They might!
https://www.airnavradar.com/data/airlines/tmw is a good example of some of these (depending on what time you check that link -- if it's night-time in the Maldives it's going to show you nothing)
Someone should do a logic blogpost - “Falsehoods programmers believe about falsehoods”
Here's a falsehood list about falsehood lists:
1. I'll never need to learn a falsehood list, so I can skip it.
2. A falsehood list is complete at the time of writing.
3. OK, but it will surely get updated with new falsehoods and clarifications.
4. Skimming the falsehood list is all I need to do to learn it.
5. OK, but surely I'll remember to recheck the falsehood list once I actually need to, right?
6. If a falsehood doesn't immediately make sense to me, there must be something wrong with it, despite the author having domain expertise that I don't.
Literally had to point out just last night how UTC is not sufficient in all scenarios. I swear it happens every 6 mos on Reddit.
7. These lists are meant to be for entertainment only
I did: https://github.com/kdeldycke/kevin-deldycke-blog/blob/main/c...
Why? What would go on it?
From that dead comment quoting a chat bot that clearly did not understand the question at all, I think maybe we can extract a single bullet point:
* “Edge cases” live only at the edges; they never creep into the middle.
But that's not much to build a post with.
"Falsehoods "falsehoods programmers believe about falsehoods" blog posters believe about "falsehoods programmers believe about "falsehoods programmers believe about logic" blog" falsehoods"
I did not believe almost any of these things on account of I try not to think about airports whenever I don't have to.
Day by day it feels less and less like regular data modeling and more like a debate with Jordan Peterson where you argue for ten hours what a "name" is.
Eventually you end up having to make choices and deal with the consequences. Otherwise Jordan Peterson would have you chasing your tail for days about what a "choice" is, and nothing would ever get done.
tl;dr: just make your best guess and always include an extra "notes" column where things can get leaky.
Not days necessarily, but I think quite a bit of time should be spent data modeling, yes. Before you’ve ever touched the keyboard, it’s very helpful to attempt to model the problem on paper or a whiteboard. You quickly find problems with your initial guess that way.
Notes / data / extra et al. columns are the worst, as a DBRE. People inevitably shove various shit into them over time instead of making an effort to properly fix past mistakes, and at some point, they practically contain their own table.
Bit of a rant: what annoys me about these lists is how they just give off a huge "you are dumb for making any assumptions, how could you not think of <extremely obscure edge case>" vibe. I'd be interested to see what the effects are of these assumptions failing, because often they are pretty reasonable assumptions for a reasonable subset of the universe. Software is imperfect and you can't cover every possibility. Like ok technically 10 flights with the same number could leave the same gate at the same time, but if 99.99% of the time they don't and you assume that, what is the real impact to people?
Reminds me of a list that came up ages ago that presented an assumption of "X code always runs" with the counterpoint that you could unplug the computer. Ok sure, but then why write software at all? Clearly no point assuming any code will ever run since you can just terminate the program at any random time.
I don't agree that this list has the attitude you describe--if anything, they just seem proud that they have many fewer of these corner case bugs than anyone else--so it is difficult to work with your example of the flight number. These are, in fact, misconceptions made by programmers, often without having the in-depth knowledge of this specific area that comes from being an actual expert (the kind that often people don't allocate for in their budgets), and this list isn't an over-the-top portrayal of such: it feels weird to become offended?
That said, I do appreciate some of these lists--which maybe has put you on edge to the paradigm--do have an edge to them... but, in all honesty, I think they should? The bugs and edge cases that these lists tend to expose aren't random glitches that equally affect every user: they usually segment users into the ones whose lives "follow the happy path" (which often just means "are intuitive and familiar to the culture near the developer") and the users who get disproportionately (or even continually!) screwed every time they dare interact with a computer.
And like, it is actually a problem that the other side of this is almost always a developer who doesn't really give a shit and considers that user's (or even an entire region/country's) existence to somehow be a negligible statistic not worth their time or energy, and I really do think that they deserve to take some flak for that (the same way I try to not get offended if someone points out how my being a cis-het white male blinds me to stuff: I think I deserve to get held to task harder by frustrated minorities rather than force them to be nice all the time in a world that penalizes them).
I don't disagree with you at all. My point was more like what another commenter said, that software adheres to a strict and very finite set of rules, the real world is way more complicated than that. It's so trivially easy to find real world counterexamples to just about any software that it's a barely interesting exercise (IMO). So you define a reasonable subset and work with that. And the reasonable subset is probably defined by positive/negative outcomes.
It would have been cool if the blog post discussed those outcomes so we can reason about it properly, otherwise it's just a list of claims at face value. If the programmer making an assumption means a screen at a gate says the wrong boarding time when there's a human there controlling the boarding, then not the end of the world. But if the programmer making an assumption causes 1/10000 flights to crash, then that's interesting and worthwhile calling out. It's just endless speculation without a proper outcome to tie it down.
At a general level I think these lists make developers more aware of uniqueness and constraints.
When designing data I think these questions (skepticisms) should be front of mind;
1) natural values are not unique.
2) things identified by number are best stored as a string. If you're not going to do math on it, it's not a number. That "customer number" should be treated as "customer id" and as a string.
3) be careful constraining data. Those "helpful checks" to make sure the "zip code is valid" are harmful not helpful.
4) those tiny edge cases may "almost never happen" but they will end up consuming your support department. Challenge your own assumptions at every possible opportunity. Never assume anything you "know" is true.
It's hard to measure time saved, and problems avoided, with good design. But it's easy to see bad design as it plays out over decades.
And (especially today) never optimize design for "size". Y2K showed that folly once and for all.
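A tiny illustration of point 2, using a made-up customer number: once an identifier passes through a numeric type, information such as leading zeros is gone for good.

```python
raw_id = "00123"                # hypothetical customer "number" as received

as_number = int(raw_id)         # treating it as a number...
round_tripped = str(as_number)  # ...and converting back loses the leading zeros

ids_match = round_tripped == raw_id  # False: "123" is not "00123"
```

Keeping the value as a string from ingestion to storage avoids this class of bug entirely.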
> 2)
This implies denormalization, which is rarely needed for performance, despite what so many believe. Now you’ve introduced referential integrity issues, and have taken a huge performance hit at scale.
> 3)
I mean, maybe don’t try to use a regex on an email address beyond “is there a local and domain portion,” but a ZIP code, as in U.S. only, seems pretty straightforward to check. I would much rather have to update a check constraint if proven wrong than to risk bad data the rest of the time.
> never optimize for size
Optimize for size when it doesn’t introduce other issues. Anyone working on 2-digit years could have and likely did see that issue, but opted to ignore it for various reasons (“not my problem,” etc.). But for example, _especially_ since Postgres has a native type for IP addresses, there is zero reason to store them as strings in dotted quad. Even if you have MySQL, store them as a UINT32, and use its built-in functions to cast back and forth.
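For reference, the conversion between dotted quad and a 32-bit integer looks like this with Python's standard `ipaddress` module (MySQL's equivalents are `INET_ATON` and `INET_NTOA`). The helper names are just for this sketch.

```python
import ipaddress


def ip_to_int(dotted_quad):
    """Convert an IPv4 address string to its unsigned 32-bit integer value."""
    return int(ipaddress.IPv4Address(dotted_quad))


def int_to_ip(value):
    """Convert an unsigned 32-bit integer back to dotted-quad notation."""
    return str(ipaddress.IPv4Address(value))


packed = ip_to_int("192.168.0.1")  # 3232235521
restored = int_to_ip(packed)       # "192.168.0.1"
```

The integer form takes four bytes and sorts and range-scans correctly, which the string form does not.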
>It's so trivially easy to find real world counterexamples to just about any software that it's a barely interesting exercise (IMO).
These lists hopefully make programmers aware that a lot of their assumptions about the real world might be wrong, or at least questionable.
Examples are assumptions on the local part of email addresses without checking the appropriate RFCs. Which then get enshrined in e.g. JavaScript libraries which everyone copies. I've been annoyed for the last 30 years by websites where the local part is expected to be composed of only [a-z0-9_-] although the plus sign (and many other characters) are valid constituents of a local part.
Or assumptions on telephone numbers. Including various ways (depending on local culture) of structuring their notation, e.g. "123 456 789" versus "12-3456-89" where software is too dumb to just ignore spaces or dashes, or even a stray whitespace character copied by accident with the mouse.
And those forms where you have to enter a credit card (or bank account number) in fields of n characters each, which makes cut/copy/paste difficult because your notes contain it in the "wrong" format.
So while some examples may count as "just usability" it all stems from naive assumptions by programmers who think one size fits all (it doesn't).
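A minimal sketch of the more forgiving approach, with hypothetical helper names: strip the formatting characters people naturally type or paste, and check email addresses only for a non-empty local part and domain instead of restricting which characters may appear.

```python
ASCII_DIGITS = "0123456789"


def normalize_phone(raw):
    """Keep a leading '+' and the digits; ignore spaces, dashes and stray whitespace."""
    stripped = raw.strip()
    prefix = "+" if stripped.startswith("+") else ""
    digits = "".join(ch for ch in stripped if ch in ASCII_DIGITS)
    return prefix + digits


def looks_like_email(raw):
    """Only check that there is something before and after the last '@'."""
    local, at_sign, domain = raw.strip().rpartition("@")
    return bool(local) and bool(at_sign) and bool(domain)


spaced = normalize_phone(" 123 456 789 ")             # "123456789"
dashed = normalize_phone("12-3456-89")                # "12345689"
plus_ok = looks_like_email("user+tag@example.com")    # True
```

The same digit-stripping idea applies to card and account numbers pasted with spaces; deliverability of an email address can only really be confirmed by sending a message to it.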
I disagree, in my view they do not inherently give off such vibes at all. In this post for example, they specifically broach the topic like so:
> There are a lot of assumptions one could make when designing data types and schemas for aviation data that turn out to be inaccurate.
Sounds like a pretty explicit acknowledgement of the notion that these are otherwise reasonable assumptions that just happen to fail when put to the test, I'd say.
It's very easy to self-deprecate, especially if one has insecurities. But that doesn't mean that articles like this actually mean to do so. I think it's worthwhile for everyone involved to always evaluate whether the feeling is actually coming from the source you're looking at, or if that source just happened to trigger it inside you. More often than not, in my anecdotal experience, it's the latter.
I'd also find it interesting to learn what happens when these falsehoods nonetheless make it into an implementation though.
> I'd be interested to see what the effects are of these assumptions failing
Mostly confusion, but the combination of aviation and confusion can be dangerous and even deadly. Not directly related to this list, but I'm reminded of [1]: no one entity has set out to inconvenience the hapless traveler, but the combination of history and practice are a constant source of irritation, and at the times of heightened tensions and security might even lead to scary incidents. All because of the name.
[1] https://travel.stackexchange.com/questions/149323/my-name-ca...
Usually I use lists like this to define design constraints. This sort of thing becomes a template for the tables in the database.
[flagged]
It's never possible to downvote submissions (only comments), but you can flag them if you think they're unfit for HN.
[flagged]
Please don't hector someone like this when someone has shared an interesting story from their unique professional experience.
We detached this comment from https://news.ycombinator.com/item?id=44207171 and marked it off-topic.