ericmcer an hour ago

Can anyone explain why Netflix is considered to have such high tier engineering? Just from a super high level view they store and serve ~5000 videos saved at a few different qualities (4?) so lets say a total of 20,000 videos. Those files only change when specific privileged users update them.

Compare that with Youtube where ~5,000 videos are uploaded, processed into different formats/qualities every minute, and can be added by anyone with an email. It seems like Netflix has a fairly trivial problem when compared with video sharing or content sharing sites.

  • jolynch an hour ago

    My experience has been that the talent density is the main difference. Netflix tackles huge problems with a small number of engineers. I think one angle of complexity you may be missing is efficiency - both in engineering cost and infrastructure cost.

    Also YouTube has _excellent_ engineering (e.g. Vitess in the data space), and they are building atop an excellent infrastructure (e.g. Borg and the godly Google network). It's worth noting though that the whole Netflix infrastructure team is probably smaller than a small to medium satellite org at Google.

  • NBJack an hour ago

    Hype for the engineering culture? Helps attract the right talent. It is a relatively small team that is...ah, heavily motivated to come up with good solutions around the clock. And they maintain an excellent tech blog.

    Don't get me wrong; serving the level of traffic they handle isn't easy to scale or do cost-effectively around the globe. They are also considered by some to be pioneers in chaos engineering, and made headlines years ago making a competition to find the "best" suggestion algorithm.

  • loire280 an hour ago

    You're probably right, but Netflix does a good job building their engineering brand by writing up and sharing their technical work publicly.

snicker7 4 hours ago

This API is very similar to DynamoDB, which is basically a hash table of B-trees.

My experience is that this architecture can lead to very chatty applications if you have a rich data model (eg a graph).

  • jolynch 2 hours ago

    (post author)

    It is indeed similar to DynamoDB as well as the original Cassandra Thrift API! This is intentional since those are both targeted backends and we need to be able to migrate customers between Cassandra Thrift, Cassandra CQL and DynamoDB. One of the most important things we use this abstraction for is seamless migration [1] as use cases and offerings evolve. Rather than think of KeyValue as the only database you ever need, think of it like your language's Map interface, and depending on the problem you are solving you need different implementations of that interface (different backing databases).

    Graphs are indeed a challenge (and Relational is completely out of scope), but the high-scale Netflix graph abstraction is actually built atop KV just like a Graph library might be built on top of a language's built in Map type.

    [1] https://www.youtube.com/watch?v=3bjnm1SXLlo

jerf 2 hours ago

For anyone looking for a TL;DR, I'd suggest starting at https://netflixtechblog.com/introducing-netflixs-key-value-d... , which HN is truncating so you can't see it but I've directly linked to a later section in the post with a #. Up to that point it's basically "a networked HashMap<String, SortedMap<Bytes, Bytes>>". But the ability to return partial results based on a timeout with a pagination token is somewhat unusual and the next section called "Signaling" is at least worth a look.