janus.embedding.vectorize

Classes

Vectorizer

Class for creating embeddings/vectors in a specified ChromaDB

VectorizerFactory

Interface for creating a Vectorizer independent of the type of ChromaDB client

ChromaDBVectorizer

Factory for Vectorizer that uses ChromaEmbeddingDatabase

Module Contents

class janus.embedding.vectorize.Vectorizer(client, config=None)

Bases: object

Class for creating embeddings/vectors in a specified ChromaDB

Initializes the Vectorizer class

Parameters:
  • client (chromadb.Client) – ChromaDB client instance

  • config (Optional[Dict[str, Any]]) –

get_or_create_collection(name, model_name=None)
Parameters:
Return type:

langchain_community.vectorstores.Chroma

create_collection(embedding_type, model_name=None)
Parameters:
Return type:

langchain_community.vectorstores.Chroma

collections(name=None)
Parameters:

name (None | janus.utils.enums.EmbeddingType | str) –

Return type:

Sequence[chromadb.Collection]

add_nodes_recursively(code_block, collection_name, file_name)

Embed all nodes in the tree rooted at code_block

Parameters:
Return type:

None
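
Embedding every node of a tree amounts to a depth-first walk. The sketch below shows one way such a walk could look. `Node`, its `children` attribute, and the `embed` callback are hypothetical stand-ins, since the actual type of `code_block` and the way janus stores each node are not described here; the pre-order visiting order is likewise an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    """Hypothetical tree node; the real code_block type may differ."""

    text: str
    children: list["Node"] = field(default_factory=list)


def add_nodes_recursively(node: Node, embed: Callable[[str], None]) -> None:
    # Handle the current node first, then descend into each child (pre-order).
    embed(node.text)
    for child in node.children:
        add_nodes_recursively(child, embed)


# Usage: collect the visited texts instead of calling a real embedding model.
tree = Node("module", [Node("func_a"), Node("class_b", [Node("method_c")])])
seen: list[str] = []
add_nodes_recursively(tree, seen.append)
```

Passing the embedding step in as a callback keeps the traversal independent of any particular vector store.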

add_text(collection_name, texts, metadatas, ids=None)

Helper function that stores one or more texts (passed as a list) with their associated metadata and returns the embedding ids

Parameters:
  • collection_name (janus.utils.enums.EmbeddingType | str) – Collection to add to

  • texts (list[str]) – list of texts to store

  • metadatas (list[dict]) – list of metadatas to store

  • ids (list[str]) – list of embedding ids (must match the length of texts); generated if not given by the caller

Returns:

list of embedding ids. Raises ValueError if collection not found.

Return type:

list[str]
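
The `ids` contract described above (one id per text, ids generated when the caller supplies none, `ValueError` for an unknown collection) can be illustrated with a minimal in-memory sketch. `InMemoryStore` and its dictionary layout are hypothetical stand-ins, not the janus implementation, and the use of `uuid4` for generated ids is an assumption.

```python
import uuid
from typing import Optional


class InMemoryStore:
    """Hypothetical stand-in for a vector store; not part of janus."""

    def __init__(self) -> None:
        # Maps collection name -> {embedding id: (text, metadata)}
        self.collections: dict[str, dict[str, tuple[str, dict]]] = {}

    def add_text(
        self,
        collection_name: str,
        texts: list[str],
        metadatas: list[dict],
        ids: Optional[list[str]] = None,
    ) -> list[str]:
        if collection_name not in self.collections:
            raise ValueError(f"Collection {collection_name!r} not found")
        if ids is None:
            # Assumption: random UUIDs serve as the generated ids.
            ids = [str(uuid.uuid4()) for _ in texts]
        if len(ids) != len(texts):
            raise ValueError("ids must match the length of texts")
        for id_, text, meta in zip(ids, texts, metadatas):
            self.collections[collection_name][id_] = (text, meta)
        return ids
```

Returning the ids in both cases lets callers treat generated and caller-supplied ids the same way when they later look up or delete entries.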

property config

class janus.embedding.vectorize.VectorizerFactory

Bases: abc.ABC

Interface for creating a Vectorizer independent of the type of ChromaDB client

abstract create_vectorizer(path, config={})

Factory method

Parameters:
Return type:

Vectorizer

class janus.embedding.vectorize.ChromaDBVectorizer

Bases: VectorizerFactory

Factory for Vectorizer that uses ChromaEmbeddingDatabase

create_vectorizer(path=Path.home() / '.janus' / 'chroma' / 'chroma-data', config=None)
Parameters:
  • path (str | pathlib.Path) – The path to the ChromaDB. Either a string (a URL or a filesystem path) or a Path object

  • config (Optional[Dict[str, Any]]) –

Returns:

Vectorizer

Return type:

Vectorizer
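
The relationship between the abstract factory and its concrete subclass can be sketched as follows. All three classes are hypothetical stand-ins written for illustration (they do not touch ChromaDB and are not the janus classes); only the default path mirrors the signature documented above. The sketch handles filesystem paths only, not URLs.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Optional, Union


class SketchVectorizer:
    """Hypothetical stand-in for Vectorizer; stores only its inputs."""

    def __init__(self, path: Path, config: Optional[dict[str, Any]] = None) -> None:
        self.path = path
        # Copy the dict so the caller's object is never shared or mutated.
        self.config = dict(config or {})


class SketchFactory(ABC):
    """Hypothetical counterpart of VectorizerFactory."""

    @abstractmethod
    def create_vectorizer(
        self, path: Union[str, Path], config: Optional[dict[str, Any]] = None
    ) -> SketchVectorizer:
        """Factory method implemented by concrete subclasses."""


class SketchChromaFactory(SketchFactory):
    """Hypothetical counterpart of ChromaDBVectorizer."""

    def create_vectorizer(
        self,
        # The default is evaluated once, when the function is defined.
        path: Union[str, Path] = Path.home() / ".janus" / "chroma" / "chroma-data",
        config: Optional[dict[str, Any]] = None,
    ) -> SketchVectorizer:
        # Accept either a string or a Path and normalize to Path.
        return SketchVectorizer(Path(path), config)
```

In this sketch `config` defaults to `None` in both classes: a `{}` default in Python is created once and shared across calls, so `None` plus a copy inside the constructor is the safer choice.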