Notebook Examples
The tutorials below show how to understand and use Feature Store. The raw notebook files are available in our tutorials repository.
Quick start examples
| Jupyter Notebook | Description |
|---|---|
| | Ingestion of data. Querying and exploration of data. |
| | Ingestion of data. Querying and exploration of data. |
| | Schema evolution allows you to easily change a table’s current schema to accommodate data that is changing over time. Schema enforcement, also known as schema validation, is a safeguard in Delta Lake that ensures data quality by rejecting writes to a table that don’t match the table’s schema. |
| | Example to demonstrate storage of medical records in Feature Store. |
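
The difference between schema enforcement and schema evolution described in the table can be sketched with a toy in-memory table. This is a minimal illustration in plain Python, not the Feature Store or Delta Lake API: `ToyTable`, `SchemaMismatchError`, and the `merge_schema` flag are hypothetical names chosen for this example, and the sample columns are invented.

```python
from dataclasses import dataclass, field


class SchemaMismatchError(ValueError):
    """Raised when incoming rows do not match the table schema."""


@dataclass
class ToyTable:
    # Column name -> Python type; a stand-in for a real table schema.
    schema: dict
    rows: list = field(default_factory=list)

    def write(self, new_rows, merge_schema=False):
        new_rows = [dict(row) for row in new_rows]
        new_columns = {}
        # Validate the whole batch first so a rejected write changes nothing.
        for row in new_rows:
            for name, value in row.items():
                expected = self.schema.get(name) or new_columns.get(name)
                if expected is None:
                    if not merge_schema:
                        # Enforcement: unknown columns are rejected.
                        raise SchemaMismatchError(f"unexpected column: {name!r}")
                    # Evolution: remember the new column and its type.
                    new_columns[name] = type(value)
                elif not isinstance(value, expected):
                    raise SchemaMismatchError(
                        f"column {name!r} expects {expected.__name__}"
                    )
        if new_columns:
            self.schema.update(new_columns)
            # Existing rows get a null for every newly added column.
            for old in self.rows:
                for name in new_columns:
                    old.setdefault(name, None)
        for row in new_rows:
            self.rows.append({name: row.get(name) for name in self.schema})


table = ToyTable(schema={"patient_id": int, "age": int})
table.write([{"patient_id": 1, "age": 42}])

evolved_row = {"patient_id": 2, "age": 37, "blood_type": "A+"}
try:
    table.write([evolved_row])  # rejected: schema enforcement
    enforced = False
except SchemaMismatchError:
    enforced = True

table.write([evolved_row], merge_schema=True)  # accepted: schema evolution
```

With enforcement alone the second write fails and the table is left unchanged; opting in to evolution widens the schema and backfills the older row with a null for the new column.
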
Big data operations using OCI DataFlow
| Jupyter Notebook | Description |
|---|---|
| | Ingestion of data using Spark Magic. Querying and exploration of data using Spark Magic. |
Streaming operations using Spark Streaming
| Jupyter Notebook | Description |
|---|---|
| | Ingestion of data using Spark Streaming. Modes of ingestion: COMPLETE and APPEND. |
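
The two ingestion modes named above can be illustrated without a Spark cluster. The sketch below is loosely modeled on the append and complete output modes of Spark Structured Streaming; how Feature Store maps its own modes onto them is an assumption here, and `run_stream` is a hypothetical helper, not part of any library.

```python
from collections import Counter


def run_stream(micro_batches, mode):
    """Return what a sink would receive after each micro-batch."""
    if mode not in ("APPEND", "COMPLETE"):
        raise ValueError(f"unsupported mode: {mode!r}")
    totals = Counter()
    emitted = []
    for batch in micro_batches:
        if mode == "APPEND":
            # Only the rows that arrived in this micro-batch are written.
            emitted.append(list(batch))
        else:
            # The whole aggregated result is rewritten on every trigger.
            totals.update(batch)
            emitted.append(dict(totals))
    return emitted


batches = [["a", "b"], ["a"], ["c", "a"]]
append_out = run_stream(batches, "APPEND")
complete_out = run_stream(batches, "COMPLETE")
```

In this toy model APPEND grows the sink incrementally, while COMPLETE replaces it with the full running counts each time, which is why the latter is typically paired with aggregations.
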
LLM Use cases
| Jupyter Notebook | Description |
|---|---|
| | Embedding feature stores are optimized for fast and efficient retrieval of embeddings. This is important because embeddings can be high-dimensional and computationally expensive to calculate. By storing them in a dedicated store, you can avoid the need to recalculate embeddings for the same data repeatedly. |
| Synthetic data generation in feature store using OpenAI and FewShotPromptTemplate | Synthetic data is artificially generated data, rather than data collected from real-world events. It’s used to simulate real data without compromising privacy or encountering real-world limitations. |
| PII Data redaction, Summarise Content and Translate content using Doctran and OpenAI | One way to think of Doctran is an LLM-powered black box where messy strings go in and nice, clean, labelled strings come out. Another way to think about it is a modular, declarative wrapper over OpenAI’s function calling feature that significantly improves the developer experience. |
| | Embedding feature stores are optimized for fast and efficient retrieval of embeddings. This is important because embeddings can be high-dimensional and computationally expensive to calculate. By storing them in a dedicated store, you can avoid the need to recalculate embeddings for the same data repeatedly. |
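
The point about avoiding repeated embedding computation can be sketched as a small cache keyed by a hash of the input text. This is only an illustration of the idea, not how Feature Store persists embeddings: `EmbeddingStore` and `toy_embed` are hypothetical names, and `toy_embed` is a cheap stand-in for a real (expensive) embedding model.

```python
import hashlib


class EmbeddingStore:
    """Keeps one embedding per distinct text so it is computed only once."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._vectors = {}
        self.misses = 0  # how many times the model actually had to run

    @staticmethod
    def _key(text):
        # A content hash gives a stable, fixed-size lookup key.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def get(self, text):
        key = self._key(text)
        if key not in self._vectors:
            self.misses += 1
            self._vectors[key] = self._embed_fn(text)
        return self._vectors[key]


def toy_embed(text):
    # Hypothetical stand-in for a real embedding model call.
    return [len(text), sum(map(ord, text)) % 97]


store = EmbeddingStore(toy_embed)
first = store.get("chest pain")
second = store.get("chest pain")  # served from the store, no recomputation
other = store.get("headache")
```

Here the embedding function runs twice for three lookups; with a real model the saved call is where the cost reduction comes from.
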