Very interesting choice of using Cypher[0]
In 2014, we built a similar type of event-driven system, but specifically for document distribution: a document can be distributed to a target set of entities, and if a new entity is added, we need to resolve which distributions match. We also ended up using Cypher via Neo4j because of the complex taxonomical structure of how we mapped entities.
It is a super underrated query language and while most of the queries could also be translated to relational SQL, Cypher's linear construction using WITH clauses is far, far easier to reason about, IMO.
EDIT: feel like the devs went overboard with the mix of languages. Shoehorned in C# Blazor? Using JS and Jest for e2e testing?
[0] https://drasi.io/reference/query-language/
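For anyone wondering what "resolve which distributions match" could look like in practice, here is a minimal sketch under made-up assumptions: entities form a tree via a hypothetical `PARENT` map, each distribution targets one node, and a new entity matches every distribution that targets it or one of its ancestors. This is not the system described above (which used Cypher on Neo4j), only the rough shape of the problem; every name and sample value is invented for illustration.

```python
# Hypothetical taxonomy: child -> parent (None marks the root).
PARENT = {
    "EMEA": None,
    "Germany": "EMEA",
    "Trial-42": "Germany",
    "Site-Berlin": "Trial-42",
}

# Hypothetical distributions: document -> taxonomy node it targets.
DISTRIBUTIONS = {
    "safety-letter.pdf": "EMEA",
    "protocol-v3.pdf": "Trial-42",
    "us-only-form.pdf": "USA",
}


def ancestors_and_self(entity, parent):
    """Yield the entity, then each ancestor up to the root."""
    while entity is not None:
        yield entity
        entity = parent.get(entity)


def matching_distributions(entity, parent, distributions):
    """Documents whose target is the entity or one of its ancestors."""
    lineage = set(ancestors_and_self(entity, parent))
    return sorted(doc for doc, target in distributions.items() if target in lineage)


# A new site is added under an existing trial; resolve what it should receive.
PARENT["Site-Munich"] = "Trial-42"
print(matching_distributions("Site-Munich", PARENT, DISTRIBUTIONS))
```

In a graph database the ancestor walk would presumably be a single traversal pattern rather than a hand-written loop, which is part of the appeal being described.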
> while most of the queries could also be translated to relational SQL, Cypher's linear construction using WITH clauses is far, far easier to reason about, IMO.
https://prql-lang.org/
Didn't look too deeply, but one of the keys with Cypher (at least in the context of graph databases) is that it has a nice way of representing `JOIN` operations as graph traversals.
MATCH (p:Person)-[r]-(c:Company) RETURN p.Name, c.Name
Where `r` can represent any relationship (AKA `JOIN`) between the two collections `Person` and `Company` such as `WORKS_AT`, `EMPLOYED_BY`, `CONTRACTOR_FOR`, etc.
So I'd say that linear queries are one of the things I like about Cypher, but the clean abstraction of complex `JOIN` operations is another huge one.
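To make the `JOIN` comparison concrete, here is a rough sketch of what a pattern like `MATCH (p:Person)-[r]-(c:Company) RETURN p.Name, c.Name` might look like relationally, using Python's built-in `sqlite3`. The `person`, `company`, and `rel` tables and the sample rows are hypothetical. The single generic `rel` table is also an assumption: a typical relational schema would more likely have one join table per relationship type, which would mean one `JOIN` (or `UNION` branch) per type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    -- One generic edge table standing in for every relationship type.
    CREATE TABLE rel (person_id INTEGER, company_id INTEGER, type TEXT);

    INSERT INTO person  VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO company VALUES (10, 'Acme'), (20, 'Initech');
    INSERT INTO rel VALUES
        (1, 10, 'WORKS_AT'),
        (2, 10, 'CONTRACTOR_FOR'),
        (2, 20, 'EMPLOYED_BY');
""")

# Rough relational counterpart of:
#   MATCH (p:Person)-[r]-(c:Company) RETURN p.Name, c.Name
rows = conn.execute("""
    SELECT p.name, c.name
    FROM person p
    JOIN rel r     ON r.person_id = p.id
    JOIN company c ON c.id = r.company_id
    ORDER BY p.name, c.name
""").fetchall()

print(rows)
```

The two explicit joins through `rel` are what the `-[r]-` segment of the Cypher pattern expresses in one step.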
> […] Where `r` can represent any relationship […]
… and `-[r]-` can represent either relationship direction, which obviates the need for constructing separate queries to traverse a relationship in the inverse direction. Kinda like running a compiler forward and backward.
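A toy illustration of the direction point, in plain Python rather than Cypher: edges are stored with a single direction, and a hypothetical `match` helper (written for this example, not part of any library) either respects that direction or ignores it, roughly as `-[r]->` versus `-[r]-` would.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    src: str
    rel: str
    dst: str


# Hypothetical sample data; each edge is stored with one direction only.
EDGES = [
    Edge("Ada", "WORKS_AT", "Acme"),
    Edge("Initech", "EMPLOYS", "Grace"),
]


def match(edges, start, directed):
    """Return (relationship, neighbour) pairs reachable from `start`.

    directed=True  ~ (start)-[r]->(n): only follow edges leaving `start`.
    directed=False ~ (start)-[r]-(n):  follow edges in either direction.
    """
    found = []
    for e in edges:
        if e.src == start:
            found.append((e.rel, e.dst))
        elif not directed and e.dst == start:
            found.append((e.rel, e.src))
    return found


print(match(EDGES, "Grace", directed=True))   # Grace has no outgoing edges
print(match(EDGES, "Grace", directed=False))  # the inverse edge is found too
```

With the undirected form, the same call works whether the data was modelled as "company employs person" or "person works at company".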
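We made a health backend partly using Cypher, and the main thing I found was that the simple queries looked amazing, but as soon as you needed to join non-linearly it started looking a lot like SQL again. And when you're using an ORM it stops mattering. And when you need migrations it gets painful!
At least in our use case, even with some very gnarly 20+ line Cypher queries, it never got to the point where it felt like SQL and certainly, those same queries would be even gnarlier as nested sub-selects, CTEs, or recursive selects, IMO.
Perhaps a characteristic of our model (a taxonomy of Region, Country, Sponsor, Program, Trial, Site, Staff for global clinical trials and documents required by Region/Country/Program/Trial).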
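For a feel of the relational side of this comparison, here is a sketch of walking a taxonomy like that in SQL with a recursive CTE, run through Python's built-in `sqlite3`. The `node` table, its adjacency-list layout, and the sample rows are all hypothetical and much simpler than the model described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical taxonomy stored as an adjacency list.
    CREATE TABLE node (name TEXT PRIMARY KEY, kind TEXT, parent TEXT);
    INSERT INTO node VALUES
        ('EMEA',        'Region',  NULL),
        ('Germany',     'Country', 'EMEA'),
        ('Trial-42',    'Trial',   'Germany'),
        ('Site-Berlin', 'Site',    'Trial-42');
""")

# Walk from a site up to its region. In Cypher this would be a
# variable-length pattern; in SQL it takes a recursive CTE.
rows = conn.execute("""
    WITH RECURSIVE lineage(name, kind, parent, depth) AS (
        SELECT name, kind, parent, 0 FROM node WHERE name = ?
        UNION ALL
        SELECT n.name, n.kind, n.parent, l.depth + 1
        FROM node n JOIN lineage l ON n.name = l.parent
    )
    SELECT kind, name FROM lineage ORDER BY depth
""", ("Site-Berlin",)).fetchall()

print(rows)
```

Even this single hierarchy walk needs an anchor query, a recursive branch, and a depth counter, so it is easy to see how queries spanning several levels and relationship types would grow quickly.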
I too have great memories of Cypher. Such an elegant way to write queries.
If you haven't been following it, I recently found out that it is now supported in a limited capacity by Google Spanner[0]. The openCypher initiative started a few years back and it looks like it's evolved into the (unfortunate moniker) GQL[1].
So it may be the case that we'll see more Cypher out in the wild.
[0] https://cloud.google.com/spanner/docs/graph/opencypher-refer...
[1] https://neo4j.com/blog/cypher-gql-world/
I finished reading Kleppmann's Designing Data-Intensive Applications last night, and this looks like it's straight out of the last chapter, which talks about the future. The Drasi docs don't use the term "dataflow", though.
https://www.oreilly.com/library/view/designing-data-intensiv...
That one’s also on my reading list. Was it worth the read?
This book is definitely worth the read. Or maybe worth 10 reads. It's really that awesome!
Is this something that can already be done with Apache Kafka Connect (to get data from another source into a Kafka cluster) plus Kafka (including Kafka Streams)? This image (https://github.com/drasi-project/community/raw/main/images/d...) is like Kafka Streams with a single topic. This image (https://github.com/drasi-project/community/raw/main/images/c...) is like joining 2 streams in Kafka Streams.
Looks very Azure-centric. Both installation guides (https://drasi.io/how-to-guides/install-sample-applications/b... and https://drasi.io/how-to-guides/install-sample-applications/c...) require Azure to work.
And then there's this:
> Installing Drasi in an EKS cluster can be significantly more complex than a standard installation on other platforms. Instead of downloading a CLI binary using the provided installation scripts, this approach requires modifying the source code of the Drasi CLI and building a local version of the CLI.
Is this an actual requirement or just the current easy path?
Azure SRE here. It doesn't appear to have any Azure dependencies. The CLI rebuild seems to be needed because "drasi init" assumes the Azure Kubernetes Service built-in StorageClasses for the Kubernetes PVCs for Redis and Mongo, and thus fails when running against EKS. I assume the same thing would be required on GKE. Yes, it should be more modular, but MVP.
As for the other stuff, it's using Gremlin Query Language or Postgres, which are both open. In fact, it goes out of its way not to use Azure authentication: loading a connection string as a Kubernetes secret is 100% AGAINST Azure Kubernetes best practice. Best practice would be Workload Identity.
> The CLI rebuild seems to be needed because "drasi init" assumes the Azure Kubernetes Service built-in StorageClasses for the Kubernetes PVCs for Redis and Mongo, and thus fails when running against EKS. I assume the same thing would be required on GKE. Yes, it should be more modular, but MVP.
None of these words are in the Bible.
Every bit of Microsoft open source is created at least partly as a sales strategy for Azure. They usually start within the Azure infrastructure because, well, why wouldn’t they? Then eventually they tend to make it so you can use them outside of Azure, but they never quite leave behind the part where they are “better” if you’re an Azure customer.
Time will tell if Drasi is going to go down the path where it becomes more easily usable outside of Azure (and in this case on AWS) or whether it’ll go more of a Bicep route.
That is usual for new Microsoft open source projects. It takes 1-2 months for the Azure dependencies to go away.
I'm curious about the other examples? I get it though, as many of these projects are built fulfilling a specific need within MS infrastructure.
Does it require Azure to work? Or could the Azure steps be swapped out relatively easily for AWS/GCP/etc?
Azure is the new Windows, as a timesharing OS, so yeah, that is to be expected.
I see more Cypher fans out here - check out https://cyphernet.es if you work with Kubernetes!
Oh, this very much reminds me of [Feldera](https://feldera.com): they do incremental loads and computations using some novel approaches (most of which I am too dumb to follow). Really nice folks too.
Or the related Materialize stuff https://materialize.com/
I took a brief look into Drasi and it looks like it doesn't do any of the differential/timely dataflow stuff (like Materialize does), or any other sophisticated incremental view maintenance methods that are rooted in Microsoft Research.
Drasi...React...well played Microsoft, well played :D
Assuming they chose this name from the Greek δράση, which means action, React is of course the exact opposite of action, thus the React-ion; an action expects a reaction, somewhere, somehow!
Not like Microsoft to name things well...
VMS++ = Windows NT?
https://azure.microsoft.com/en-us/blog/drasi-microsofts-newe...
This is a very solid pattern. Many systems that are built using traditional relational database systems would lend themselves to far simpler designs using this paradigm. It is not necessarily immediately obvious but nonetheless quite true.
Beginning with Boolean operators: and / or - this relational service model can distribute queries. Curious why Cypher [0] abandons this syntax.
Cypher is so cool. I included a graph database in my RAG patient chatbot
https://github.com/SiddanthEmani/patient_chatbot
Go seems to be a good choice for data processing systems.
I wish I could use Cypher for everything
What does it process data from, and what does it process it into?
Is it programmable, or do you have a concrete concept theorised?
What is it useful for? How does it help a business save costs or increase profit? Is it a hobby project?
Purple!
Green!
Just my luck. I get stuck with a race that speaks only in macros.
But at what cost?