janus.language.naive.tag_splitter
Classes
TagSplitter – Splits code by tags inserted into code
Module Contents
- class janus.language.naive.tag_splitter.TagSplitter(tag, *args, **kwargs)
Bases: janus.language.splitter.Splitter
Splits code by tags inserted into code
- Parameters:
language – The name of the language to split.
model – The name of the model to use for counting tokens. If model is None, tiktoken’s default tokenizer is used to count tokens.
max_tokens – The maximum number of tokens to use for each functional block.
skip_merge – Whether to skip merging child nodes up to the max_tokens length. May be used for situations like documentation, where function-level documentation is preferred. TODO: Maybe instead support something like a list of node types that shouldn’t be merged (e.g. functions, classes)?
prune_unprotected – Whether to prune unprotected nodes from the tree.
tag (str) –
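The idea behind tag-based splitting and the max_tokens limit can be sketched in plain Python. This is only an illustration under stated assumptions, not the TagSplitter implementation: the helper names (count_tokens, split_by_tag, merge_blocks), the "<SPLIT>" tag, and the whitespace-based token count are all hypothetical stand-ins, and the real class may count tokens and merge blocks differently.

```python
from typing import List


def count_tokens(text: str) -> int:
    """Stand-in token counter (assumption): counts whitespace-separated words,
    not the tokens a model tokenizer would produce."""
    return len(text.split())


def split_by_tag(code: str, tag: str) -> List[str]:
    """Split code at each occurrence of tag, dropping whitespace-only pieces."""
    return [piece for piece in code.split(tag) if piece.strip()]


def merge_blocks(blocks: List[str], max_tokens: int) -> List[str]:
    """Greedily merge adjacent blocks while the merged block stays within max_tokens."""
    merged: List[str] = []
    for block in blocks:
        if merged and count_tokens(merged[-1]) + count_tokens(block) <= max_tokens:
            merged[-1] += block
        else:
            merged.append(block)
    return merged


# "<SPLIT>" is a hypothetical tag chosen for this example.
code = "a = 1\n<SPLIT>b = 2\n<SPLIT>print(a + b)\n"
blocks = split_by_tag(code, "<SPLIT>")
merged = merge_blocks(blocks, max_tokens=6)
```

In this sketch, skipping the merge step corresponds to using blocks directly, which keeps one block per tagged region; merging trades that granularity for fewer, larger blocks.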