README
Blurr transforms structured, streaming raw data into features for model training and prediction using a high-level, expressive YAML-based language called the Blurr Transform Spec (BTS). The BTS merges the schema and computation model for data processing.
The BTS is a data transform definition for structured data. The BTS encapsulates the business logic of data transforms and Blurr orchestrates the execution of data transforms. Blurr is runner-agnostic, so BTSs can be run by event processors such as Spark, Spark Streaming or Flink.
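To make the idea of turning raw streaming events into features concrete, here is a minimal Python sketch. It is not Blurr's API and not BTS syntax: the event fields (`user_id`, `event_id`, `amount`), the `UserFeatures` class, and the `apply_event` and `transform` functions are all hypothetical names chosen for illustration. It only shows the general pattern of keeping the per-event business logic separate from the loop that feeds it events.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class UserFeatures:
    """Hypothetical per-user feature record built from raw events."""
    events_seen: int = 0
    games_played: int = 0
    total_purchase: float = 0.0


def apply_event(features: UserFeatures, event: dict) -> None:
    """Business logic: how a single raw event updates the features.

    This plays the role a transform definition would play; it knows
    nothing about where the events come from.
    """
    features.events_seen += 1
    if event["event_id"] == "game_start":
        features.games_played += 1
    elif event["event_id"] == "purchase":
        features.total_purchase += float(event.get("amount", 0.0))


def transform(events: Iterable[dict]) -> Dict[str, UserFeatures]:
    """A trivial local 'runner': feeds events to the logic, grouped by user.

    A distributed engine could replace this loop without changing
    apply_event.
    """
    store: Dict[str, UserFeatures] = defaultdict(UserFeatures)
    for event in events:
        apply_event(store[event["user_id"]], event)
    return dict(store)


# Made-up sample events, for illustration only.
raw_events = [
    {"user_id": "u1", "event_id": "game_start"},
    {"user_id": "u1", "event_id": "purchase", "amount": 5.0},
    {"user_id": "u2", "event_id": "game_start"},
    {"user_id": "u1", "event_id": "game_start"},
    {"user_id": "u1", "event_id": "purchase", "amount": 2.5},
]

features = transform(raw_events)
for user_id, user_features in sorted(features.items()):
    print(user_id, user_features)
```

Because `apply_event` never touches the event source, the same logic could in principle be driven by a different execution engine, which is the kind of separation the runner-agnostic claim above describes.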
Blurr is for you if you are well on your way on the ML 'curve of enlightenment' and are thinking about how to do online scoring.
"Coming up with features is difficult, time-consuming, requires expert knowledge. 'Applied machine learning' is basically feature engineering." (Andrew Ng)
Preparing data for specific use cases using Blurr:
Welcome to the Blurr community! We are so glad that you share our passion for building MLOps!
Data pipelines are versioned and reproducible
Pipelines (re)build in one step
Deploying to production needs minimal engineering help
Successful ML is a long game. You play it like it is
Kaizen. Experimentation and iterations are a way of life
Blurr is currently in Developer Preview. Stay in touch: star this project or email hello@blurr.ai
Local transformations only
Support for custom functions and other python libraries in the BTS
Spark runner
S3 support for data sink
DynamoDB as an Intermediate Store
Features server
Please create an issue to begin a discussion. Alternatively, feel free to pick up an existing issue!
Please sign the contributor license agreement before raising a pull request.
Inspired by the (old school) Joel Test to rate software teams, here's our version for data science teams. What's your score?