janus.language.splitter
Attributes
Exceptions
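TokenLimitError | An exception raised when the token limit is exceeded and the code cannot be split into smaller blocks.
EmptyTreeError | An exception raised when the tree is empty or does not exist (can happen when there are no nodes of interest in the tree).
FileSizeError | An exception raised when the file size is too large for the splitter.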
Classes
Splitter | A class for splitting code into functional blocks to prompt with for transcoding.
Module Contents
- janus.language.splitter.log
- exception janus.language.splitter.TokenLimitError
Bases: Exception
An exception raised when the token limit is exceeded and the code cannot be split into smaller blocks.
- exception janus.language.splitter.EmptyTreeError
Bases: Exception
An exception raised when the tree is empty or does not exist (can happen when there are no nodes of interest in the tree).
- exception janus.language.splitter.FileSizeError
Bases: Exception
An exception raised when the file size is too large for the splitter.
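The three exceptions describe distinct failure modes, so a caller can react to each one separately. The sketch below is only illustrative: it defines local stand-in classes with the same names instead of importing janus, and `fake_split`, `split_or_explain`, and their thresholds are hypothetical.

```python
from typing import List


class TokenLimitError(Exception):
    """Stand-in: a block exceeds the token limit and cannot be split further."""


class EmptyTreeError(Exception):
    """Stand-in: the tree is empty or has no nodes of interest."""


class FileSizeError(Exception):
    """Stand-in: the input is too large for the splitter."""


def fake_split(code: str, max_tokens: int = 8, max_chars: int = 200) -> List[str]:
    """Hypothetical toy splitter: one block per non-blank line."""
    if len(code) > max_chars:
        raise FileSizeError(f"input has {len(code)} characters, limit is {max_chars}")
    blocks = [line for line in code.splitlines() if line.strip()]
    if not blocks:
        raise EmptyTreeError("no nodes of interest found")
    for block in blocks:
        # Whitespace-separated words stand in for real tokens here.
        if len(block.split()) > max_tokens:
            raise TokenLimitError(f"block cannot be split below {max_tokens} tokens")
    return blocks


def split_or_explain(code: str) -> str:
    """Map each failure mode to a different caller-side reaction."""
    try:
        return f"ok: {len(fake_split(code))} blocks"
    except FileSizeError:
        return "skip: file too large"
    except EmptyTreeError:
        return "skip: nothing to split"
    except TokenLimitError:
        return "retry: raise max_tokens or split manually"
```

Catching the three types separately keeps the recovery policy in the caller; whether a real pipeline should skip, retry, or abort in each case depends on the application.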
- class janus.language.splitter.Splitter(language, model=None, max_tokens=4096, skip_merge=False, protected_node_types=(), prune_node_types=(), prune_unprotected=False)
Bases: janus.language.file.FileManager
A class for splitting code into functional blocks to prompt with for transcoding.
- Parameters:
language (str) – The name of the language to split.
model (janus.llm.models_info.JanusModel | None) – The model whose tokenizer is used to count tokens. If None, tiktoken’s default tokenizer is used.
max_tokens (int) – The maximum number of tokens to use for each functional block.
skip_merge (bool) – Whether to skip merging child nodes up to the max_tokens length. May be useful in situations like documentation, where function-level blocks are preferred. TODO: Maybe instead support something like a list of node types that shouldn't be merged (e.g. functions, classes)?
prune_unprotected (bool) – Whether to prune unprotected nodes from the tree.
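To make the interaction between `max_tokens` and `skip_merge` concrete, here is a minimal sketch of greedy merging under a token budget. It illustrates the general technique under stated assumptions and is not the library's implementation: `count_tokens` uses whitespace splitting as a stand-in for a real tokenizer, and `merge_children` is a hypothetical helper.

```python
from typing import List


def count_tokens(text: str) -> int:
    """Stand-in tokenizer (assumption): counts whitespace-separated words."""
    return len(text.split())


def merge_children(
    children: List[str], max_tokens: int, skip_merge: bool = False
) -> List[str]:
    """Greedily join adjacent children while the result fits in max_tokens."""
    if skip_merge:
        # Leave every child as its own block.
        return list(children)

    merged: List[str] = []
    current = ""
    for child in children:
        candidate = child if not current else current + "\n" + child
        if current and count_tokens(candidate) > max_tokens:
            # Adding this child would exceed the budget: close the current block.
            merged.append(current)
            current = child
        else:
            current = candidate
    if current:
        merged.append(current)
    return merged
```

In this sketch a single child that already exceeds the budget is passed through unchanged; a real splitter would need to split it further or report an error.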
- split(file)
Split the given file into functional code blocks.
- Parameters:
file (pathlib.Path | str) – The file to split into functional blocks.
- Returns:
A CodeBlock made up of nested `CodeBlock`s.
- Return type:
- split_string(code, name)
Split the given code into functional code blocks.
- Parameters:
- Returns:
A CodeBlock made up of nested `CodeBlock`s.
- Return type:
- merge_nodes(nodes)
Merge a list of nodes into a single node. The first and last nodes’ respective prefix and suffix become this node’s affixes.
- Parameters:
nodes (List[janus.language.block.CodeBlock]) –
- Return type:
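The affix rule described for merge_nodes can be sketched with a toy data structure. Everything here is an assumption made for illustration: `Block` is a hypothetical stand-in for a code block with `prefix`, `text`, and `suffix` fields, `merge_blocks` is not the library's method, and folding the interior affixes into the merged text is one plausible choice rather than confirmed behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Hypothetical stand-in: text plus the whitespace/comments around it."""
    prefix: str
    text: str
    suffix: str


def merge_blocks(blocks: List[Block]) -> Block:
    """Merge blocks; the first prefix and last suffix become the new affixes."""
    if not blocks:
        raise ValueError("cannot merge an empty list of blocks")
    if len(blocks) == 1:
        return blocks[0]

    first, last = blocks[0], blocks[-1]
    # Interior affixes are folded into the merged text (assumption).
    parts = [first.text, first.suffix]
    for block in blocks[1:-1]:
        parts += [block.prefix, block.text, block.suffix]
    parts += [last.prefix, last.text]
    return Block(prefix=first.prefix, text="".join(parts), suffix=last.suffix)
```

With this choice, concatenating `prefix + text + suffix` of the merged block reproduces the concatenation of the original blocks, so no source text is lost by merging.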