Provider

Llama Cpp Server

`llama_cpp_agent.providers.llama_cpp_server`

`LlamaCppSamplingSettings` `dataclass`

Bases: LlmSamplingSettings

Settings for generating completions using the Llama.cpp server.

Parameters:

temperature (float, default: 0.8 ) –

Controls the randomness of the generated completions. Higher values make the output more random.
top_k (int, default: 40 ) –

Controls the diversity of the top-k sampling. Higher values result in more diverse completions.
top_p (float, default: 0.95 ) –

Controls the diversity of the nucleus sampling. Higher values result in more diverse completions.
min_p (float, default: 0.05 ) –

Minimum probability for a token to be considered, relative to the probability of the most likely token (min-p sampling). Higher values result in more focused completions.
n_predict (int, default: -1 ) –

Maximum number of tokens to predict. Set to -1 for no fixed limit.
n_keep (int, default: 0 ) –

Number of prompt tokens to keep when the context size is exceeded. Set to 0 to keep none.
stream (bool, default: True ) –

Enable streaming for long completions.
additional_stop_sequences (List[str], default: None ) –

List of stop sequences to finish completion generation. The official stop sequences of the model get added automatically.
tfs_z (float, default: 1.0 ) –

Parameter z for tail-free sampling. A value of 1.0 disables it.
typical_p (float, default: 1.0 ) –

Parameter p for locally typical sampling. A value of 1.0 disables it.
repeat_penalty (float, default: 1.1 ) –

Penalty for repeating tokens in completions.
repeat_last_n (int, default: -1 ) –

Number of tokens to consider for repeat penalty.
penalize_nl (bool, default: False ) –

Enable penalizing newlines in completions.
presence_penalty (float, default: 0.0 ) –

Penalty for presence of certain tokens.
frequency_penalty (float, default: 0.0 ) –

Penalty based on token frequency.
penalty_prompt (Union[None, str, List[int]], default: None ) –

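Prompt to use for penalty evaluation in place of the original prompt, given as a string or a list of token ids. None uses the original prompt.
mirostat_mode (int, default: 0 ) –

Mirostat sampling mode: 0 disables it, 1 selects Mirostat, 2 selects Mirostat 2.0.
mirostat_tau (float, default: 5.0 ) –

Mirostat target entropy (tau).
mirostat_eta (float, default: 0.1 ) –

Mirostat learning rate (eta).
seed (int, default: -1 ) –

Seed for the random number generator. Set to -1 for a random seed.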
ignore_eos (bool, default: False ) –

Ignore end-of-sequence token.
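
To make the interplay of `top_k`, `top_p` and `min_p` concrete, the following is a minimal sketch of how such filters narrow a token distribution. It is an illustration only, not llama.cpp's implementation: the function name `filter_candidates`, the fixed filter order and the toy probabilities are all assumptions.

```python
from typing import Dict


def filter_candidates(probs: Dict[str, float], top_k: int = 40,
                      top_p: float = 0.95, min_p: float = 0.05) -> Dict[str, float]:
    """Toy top-k -> top-p -> min-p filter over a token -> probability map."""
    # Sort tokens from most to least likely.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    # top_k: keep only the k most probable tokens (k <= 0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # min_p: drop tokens less likely than min_p times the most likely token.
    threshold = min_p * kept[0][1]
    kept = [(t, p) for t, p in kept if p >= threshold]
    # Renormalise so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {t: p / total for t, p in kept}


toy = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.04, "qux": 0.01}
loose = filter_candidates(toy, top_k=40, top_p=0.9, min_p=0.05)
strict = filter_candidates(toy, top_k=40, top_p=0.9, min_p=0.5)
```

With the toy distribution, raising `min_p` from 0.05 to 0.5 removes `"cat"` from the candidate set, which is why higher `min_p` values give more focused completions.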

Attributes:

temperature (float) –

Controls the randomness of the generated completions. Higher values make the output more random.
top_k (int) –

Controls the diversity of the top-k sampling. Higher values result in more diverse completions.
top_p (float) –

Controls the diversity of the nucleus sampling. Higher values result in more diverse completions.
min_p (float) –

Minimum probability for a token to be considered, relative to the probability of the most likely token (min-p sampling). Higher values result in more focused completions.
n_predict (int) –

Maximum number of tokens to predict. Set to -1 for no fixed limit.
n_keep (int) –

Number of prompt tokens to keep when the context size is exceeded. Set to 0 to keep none.
stream (bool) –

Enable streaming for long completions.
additional_stop_sequences (List[str]) –

List of stop sequences to finish completion generation. The official stop sequences of the model get added automatically.
tfs_z (float) –

Parameter z for tail-free sampling. A value of 1.0 disables it.
typical_p (float) –

Parameter p for locally typical sampling. A value of 1.0 disables it.
repeat_penalty (float) –

Penalty for repeating tokens in completions.
repeat_last_n (int) –

Number of tokens to consider for repeat penalty.
penalize_nl (bool) –

Enable penalizing newlines in completions.
presence_penalty (float) –

Penalty for presence of certain tokens.
frequency_penalty (float) –

Penalty based on token frequency.
penalty_prompt (Union[None, str, List[int]]) –

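Prompt to use for penalty evaluation in place of the original prompt, given as a string or a list of token ids. None uses the original prompt.
mirostat_mode (int) –

Mirostat sampling mode: 0 disables it, 1 selects Mirostat, 2 selects Mirostat 2.0.
mirostat_tau (float) –

Mirostat target entropy (tau).
mirostat_eta (float) –

Mirostat learning rate (eta).
seed (int) –

Seed for the random number generator. Set to -1 for a random seed.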
ignore_eos (bool) –

Ignore end-of-sequence token.

Methods:

save(file_path: str): Save the settings to a file.
load_from_file(file_path: str) -> LlamaCppSamplingSettings: Load the settings from a file.
load_from_dict(settings: dict) -> LlamaCppSamplingSettings: Load the settings from a dictionary.
as_dict() -> dict: Convert the settings to a dictionary.

Source code in llama_cpp_agent/providers/llama_cpp_server.py

@dataclass
class LlamaCppSamplingSettings(LlmSamplingSettings):
    """
    Settings for generating completions using the Llama.cpp server.

    Args:
        temperature (float): Controls the randomness of the generated completions. Higher values make the output more random.
        top_k (int): Controls the diversity of the top-k sampling. Higher values result in more diverse completions.
        top_p (float): Controls the diversity of the nucleus sampling. Higher values result in more diverse completions.
        min_p (float): Minimum probability for a token to be considered, relative to the probability of the most likely token (min-p sampling). Higher values result in more focused completions.
        n_predict (int): Maximum number of tokens to predict. Set to -1 for no fixed limit.
        n_keep (int): Number of prompt tokens to keep when the context size is exceeded. Set to 0 to keep none.
        stream (bool): Enable streaming for long completions.
        additional_stop_sequences (List[str]): List of stop sequences to finish completion generation. The official stop sequences of the model get added automatically.
        tfs_z (float): Parameter z for tail-free sampling. A value of 1.0 disables it.
        typical_p (float): Parameter p for locally typical sampling. A value of 1.0 disables it.
        repeat_penalty (float): Penalty for repeating tokens in completions.
        repeat_last_n (int): Number of tokens to consider for repeat penalty.
        penalize_nl (bool): Enable penalizing newlines in completions.
        presence_penalty (float): Penalty for presence of certain tokens.
        frequency_penalty (float): Penalty based on token frequency.
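        penalty_prompt (Union[None, str, List[int]]): Prompt to use for penalty evaluation in place of the original prompt, given as a string or a list of token ids. None uses the original prompt.
        mirostat_mode (int): Mirostat sampling mode: 0 disables it, 1 selects Mirostat, 2 selects Mirostat 2.0.
        mirostat_tau (float): Mirostat target entropy (tau).
        mirostat_eta (float): Mirostat learning rate (eta).
        seed (int): Seed for the random number generator. Set to -1 for a random seed.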
        ignore_eos (bool): Ignore end-of-sequence token.

    Attributes:
        temperature (float): Controls the randomness of the generated completions. Higher values make the output more random.
        top_k (int): Controls the diversity of the top-k sampling. Higher values result in more diverse completions.
        top_p (float): Controls the diversity of the nucleus sampling. Higher values result in more diverse completions.
        min_p (float): Minimum probability for a token to be considered, relative to the probability of the most likely token (min-p sampling). Higher values result in more focused completions.
        n_predict (int): Maximum number of tokens to predict. Set to -1 for no fixed limit.
        n_keep (int): Number of prompt tokens to keep when the context size is exceeded. Set to 0 to keep none.
        stream (bool): Enable streaming for long completions.
        additional_stop_sequences (List[str]): List of stop sequences to finish completion generation. The official stop sequences of the model get added automatically.
        tfs_z (float): Parameter z for tail-free sampling. A value of 1.0 disables it.
        typical_p (float): Parameter p for locally typical sampling. A value of 1.0 disables it.
        repeat_penalty (float): Penalty for repeating tokens in completions.
        repeat_last_n (int): Number of tokens to consider for repeat penalty.
        penalize_nl (bool): Enable penalizing newlines in completions.
        presence_penalty (float): Penalty for presence of certain tokens.
        frequency_penalty (float): Penalty based on token frequency.
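        penalty_prompt (Union[None, str, List[int]]): Prompt to use for penalty evaluation in place of the original prompt, given as a string or a list of token ids. None uses the original prompt.
        mirostat_mode (int): Mirostat sampling mode: 0 disables it, 1 selects Mirostat, 2 selects Mirostat 2.0.
        mirostat_tau (float): Mirostat target entropy (tau).
        mirostat_eta (float): Mirostat learning rate (eta).
        seed (int): Seed for the random number generator. Set to -1 for a random seed.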
        ignore_eos (bool): Ignore end-of-sequence token.
    Methods:
        save(file_path: str): Save the settings to a file.
        load_from_file(file_path: str) -> LlamaCppSamplingSettings: Load the settings from a file.
        load_from_dict(settings: dict) -> LlamaCppSamplingSettings: Load the settings from a dictionary.
        as_dict() -> dict: Convert the settings to a dictionary.

    """

    temperature: float = 0.8
    top_k: int = 40
    top_p: float = 0.95
    min_p: float = 0.05
    n_predict: int = -1
    n_keep: int = 0
    stream: bool = True
    additional_stop_sequences: List[str] = None
    tfs_z: float = 1.0
    typical_p: float = 1.0
    repeat_penalty: float = 1.1
    repeat_last_n: int = -1
    penalize_nl: bool = False
    presence_penalty: float = 0.0
    frequency_penalty: float = 0.0
    penalty_prompt: Union[None, str, List[int]] = None
    mirostat_mode: int = 0
    mirostat_tau: float = 5.0
    mirostat_eta: float = 0.1
    cache_prompt: bool = True
    seed: int = -1
    ignore_eos: bool = False
    samplers: List[str] = None

    def get_provider_identifier(self) -> LlmProviderId:
        return LlmProviderId.llama_cpp_server

    def get_additional_stop_sequences(self) -> List[str]:
        if self.additional_stop_sequences is None:
            self.additional_stop_sequences = []
        return self.additional_stop_sequences

    def add_additional_stop_sequences(self, sequences: List[str]):
        if self.additional_stop_sequences is None:
            self.additional_stop_sequences = []
        self.additional_stop_sequences.extend(sequences)

    def is_streaming(self):
        return self.stream

    @staticmethod
    def load_from_dict(settings: dict) -> "LlamaCppSamplingSettings":
        """
        Load the settings from a dictionary.

        Args:
            settings (dict): The dictionary containing the settings.

        Returns:
            LlamaCppSamplingSettings: The loaded settings.
        """
        return LlamaCppSamplingSettings(**settings)

    def as_dict(self) -> dict:
        """
        Convert the settings to a dictionary.

        Returns:
            dict: The dictionary representation of the settings.
        """
        return self.__dict__

`load_from_dict(settings)` `staticmethod`

Load the settings from a dictionary.

Parameters:

settings (dict) –

The dictionary containing the settings.

Returns:

LlamaCppSamplingSettings ( LlamaCppSamplingSettings ) –

The loaded settings.
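
Because `load_from_dict` forwards the dictionary as keyword arguments, a key that is not a dataclass field raises a `TypeError`. The sketch below shows that behaviour and one way to filter a dictionary first. `DemoSettings` is a hypothetical stand-in with only three of the documented fields, so the example runs without the library, and `load_known_fields` is an assumed helper, not part of `llama_cpp_agent`.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class DemoSettings:
    """Hypothetical stand-in mirroring three fields of LlamaCppSamplingSettings."""
    temperature: float = 0.8
    top_k: int = 40
    stream: bool = True

    @staticmethod
    def load_from_dict(settings: dict) -> "DemoSettings":
        # Same pattern as the documented method: forward keys as keyword arguments.
        return DemoSettings(**settings)


def load_known_fields(settings: Dict[str, Any]) -> DemoSettings:
    """Hypothetical helper: drop keys that are not dataclass fields before loading."""
    known = {f.name for f in fields(DemoSettings)}
    return DemoSettings.load_from_dict({k: v for k, v in settings.items() if k in known})


raw = {"temperature": 0.2, "top_k": 10, "unknown_option": 1}

try:
    DemoSettings.load_from_dict(raw)
    strict_error = None
except TypeError as exc:
    strict_error = exc  # raised for the unexpected keyword

loaded = load_known_fields(raw)
```

Whether silently dropping unknown keys is desirable depends on the caller; failing loudly, as the documented method does, catches typos in setting names.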

Source code in llama_cpp_agent/providers/llama_cpp_server.py

@staticmethod
def load_from_dict(settings: dict) -> "LlamaCppSamplingSettings":
    """
    Load the settings from a dictionary.

    Args:
        settings (dict): The dictionary containing the settings.

    Returns:
        LlamaCppSamplingSettings: The loaded settings.
    """
    return LlamaCppSamplingSettings(**settings)

`as_dict()`

Convert the settings to a dictionary.

Returns:

dict ( dict ) –

The dictionary representation of the settings.
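
Since the method returns `self.__dict__`, the result is the instance's live attribute dictionary rather than a copy. The sketch below illustrates the consequence using `DemoSettings`, a hypothetical two-field stand-in that copies the documented `as_dict` body.

```python
from dataclasses import dataclass


@dataclass
class DemoSettings:
    """Hypothetical stand-in with the same as_dict body as the documented class."""
    temperature: float = 0.8
    top_k: int = 40

    def as_dict(self) -> dict:
        return self.__dict__


settings = DemoSettings()

# Writing through the returned dictionary also changes the settings object.
view = settings.as_dict()
view["temperature"] = 0.1

# A shallow copy is detached: changing it leaves the settings untouched.
snapshot = dict(settings.as_dict())
snapshot["top_k"] = 1
```

A shallow copy still shares list-valued fields such as `additional_stop_sequences`; `copy.deepcopy` would be needed to detach those as well.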

Source code in llama_cpp_agent/providers/llama_cpp_server.py

def as_dict(self) -> dict:
    """
    Convert the settings to a dictionary.

    Returns:
        dict: The dictionary representation of the settings.
    """
    return self.__dict__

Llama Cpp Python

`llama_cpp_agent.providers.llama_cpp_python`

`LlamaCppPythonSamplingSettings` `dataclass`

Bases: LlmSamplingSettings

Settings for generating completions using llama-cpp-python.

Parameters:

temperature (float, default: 0.8 ) –

Controls the randomness of the generated completions. Higher values make the output more random.
top_k (int, default: 40 ) –

Controls the diversity of the top-k sampling. Higher values result in more diverse completions.
top_p (float, default: 0.95 ) –

Controls the diversity of the nucleus sampling. Higher values result in more diverse completions.
min_p (float, default: 0.05 ) –

Minimum probability for a token to be considered, relative to the probability of the most likely token (min-p sampling). Higher values result in more focused completions.
max_tokens (int, default: -1 ) –

Maximum number of tokens to generate.
stream (bool, default: False ) –

Enable streaming for long completions.
additional_stop_sequences (List[str], default: None ) –

List of stop sequences to finish completion generation. The official stop sequences of the model get added automatically.
tfs_z (float, default: 1.0 ) –

Parameter z for tail-free sampling. A value of 1.0 disables it.
typical_p (float, default: 1.0 ) –

Parameter p for locally typical sampling. A value of 1.0 disables it.
repeat_penalty (float, default: 1.1 ) –

Penalty for repeating tokens in completions.
presence_penalty (float, default: 0.0 ) –

Penalty for presence of certain tokens.
frequency_penalty (float, default: 0.0 ) –

Penalty based on token frequency.
mirostat_mode (int, default: 0 ) –

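Mirostat sampling mode: 0 disables it, 1 selects Mirostat, 2 selects Mirostat 2.0.
mirostat_tau (float, default: 5.0 ) –

Mirostat target entropy (tau).
mirostat_eta (float, default: 0.1 ) –

Mirostat learning rate (eta).
seed (int, default: -1 ) –

Seed for the random number generator. Set to -1 for a random seed.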

Attributes:

temperature (float) –

Controls the randomness of the generated completions. Higher values make the output more random.
top_k (int) –

Controls the diversity of the top-k sampling. Higher values result in more diverse completions.
top_p (float) –

Controls the diversity of the nucleus sampling. Higher values result in more diverse completions.
min_p (float) –

Minimum probability for a token to be considered, relative to the probability of the most likely token (min-p sampling). Higher values result in more focused completions.
max_tokens (int) –

Maximum number of tokens to generate.
stream (bool) –

Enable streaming for long completions.
additional_stop_sequences (List[str]) –

List of stop sequences to finish completion generation. The official stop sequences of the model get added automatically.
tfs_z (float) –

Parameter z for tail-free sampling. A value of 1.0 disables it.
typical_p (float) –

Parameter p for locally typical sampling. A value of 1.0 disables it.
repeat_penalty (float) –

Penalty for repeating tokens in completions.
presence_penalty (float) –

Penalty for presence of certain tokens.
frequency_penalty (float) –

Penalty based on token frequency.
mirostat_mode (int) –

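Mirostat sampling mode: 0 disables it, 1 selects Mirostat, 2 selects Mirostat 2.0.
mirostat_tau (float) –

Mirostat target entropy (tau).
mirostat_eta (float) –

Mirostat learning rate (eta).
seed (int) –

Seed for the random number generator. Set to -1 for a random seed.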

Methods:

save(file_path: str): Save the settings to a file.
load_from_file(file_path: str) -> LlamaCppPythonSamplingSettings: Load the settings from a file.
load_from_dict(settings: dict) -> LlamaCppPythonSamplingSettings: Load the settings from a dictionary.
as_dict() -> dict: Convert the settings to a dictionary.

Source code in llama_cpp_agent/providers/llama_cpp_python.py

@dataclass
class LlamaCppPythonSamplingSettings(LlmSamplingSettings):
    """
    Settings for generating completions using llama-cpp-python.

    Args:
        temperature (float): Controls the randomness of the generated completions. Higher values make the output more random.
        top_k (int): Controls the diversity of the top-k sampling. Higher values result in more diverse completions.
        top_p (float): Controls the diversity of the nucleus sampling. Higher values result in more diverse completions.
        min_p (float): Minimum probability for a token to be considered, relative to the probability of the most likely token (min-p sampling). Higher values result in more focused completions.
        max_tokens (int): Maximum number of tokens to generate.
        stream (bool): Enable streaming for long completions.
        additional_stop_sequences (List[str]): List of stop sequences to finish completion generation. The official stop sequences of the model get added automatically.
        tfs_z (float): Parameter z for tail-free sampling. A value of 1.0 disables it.
        typical_p (float): Parameter p for locally typical sampling. A value of 1.0 disables it.
        repeat_penalty (float): Penalty for repeating tokens in completions.
        presence_penalty (float): Penalty for presence of certain tokens.
        frequency_penalty (float): Penalty based on token frequency.
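        mirostat_mode (int): Mirostat sampling mode: 0 disables it, 1 selects Mirostat, 2 selects Mirostat 2.0.
        mirostat_tau (float): Mirostat target entropy (tau).
        mirostat_eta (float): Mirostat learning rate (eta).
        seed (int): Seed for the random number generator. Set to -1 for a random seed.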


    Attributes:
        temperature (float): Controls the randomness of the generated completions. Higher values make the output more random.
        top_k (int): Controls the diversity of the top-k sampling. Higher values result in more diverse completions.
        top_p (float): Controls the diversity of the nucleus sampling. Higher values result in more diverse completions.
        min_p (float): Minimum probability for a token to be considered, relative to the probability of the most likely token (min-p sampling). Higher values result in more focused completions.
        max_tokens (int): Maximum number of tokens to generate.
        stream (bool): Enable streaming for long completions.
        additional_stop_sequences (List[str]): List of stop sequences to finish completion generation. The official stop sequences of the model get added automatically.
        tfs_z (float): Parameter z for tail-free sampling. A value of 1.0 disables it.
        typical_p (float): Parameter p for locally typical sampling. A value of 1.0 disables it.
        repeat_penalty (float): Penalty for repeating tokens in completions.
        presence_penalty (float): Penalty for presence of certain tokens.
        frequency_penalty (float): Penalty based on token frequency.
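        mirostat_mode (int): Mirostat sampling mode: 0 disables it, 1 selects Mirostat, 2 selects Mirostat 2.0.
        mirostat_tau (float): Mirostat target entropy (tau).
        mirostat_eta (float): Mirostat learning rate (eta).
        seed (int): Seed for the random number generator. Set to -1 for a random seed.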
    Methods:
        save(file_path: str): Save the settings to a file.
        load_from_file(file_path: str) -> LlamaCppPythonSamplingSettings: Load the settings from a file.
        load_from_dict(settings: dict) -> LlamaCppPythonSamplingSettings: Load the settings from a dictionary.
        as_dict() -> dict: Convert the settings to a dictionary.

    """

    temperature: float = 0.8
    top_k: int = 40
    top_p: float = 0.95
    min_p: float = 0.05
    max_tokens: int = -1
    stream: bool = False
    additional_stop_sequences: List[str] = None
    tfs_z: float = 1.0
    typical_p: float = 1.0
    repeat_penalty: float = 1.1
    presence_penalty: float = 0.0
    frequency_penalty: float = 0.0
    mirostat_mode: int = 0
    mirostat_tau: float = 5.0
    mirostat_eta: float = 0.1
    seed: int = -1

    def get_provider_identifier(self) -> LlmProviderId:
        return LlmProviderId.llama_cpp_server

    def get_additional_stop_sequences(self) -> List[str]:
        if self.additional_stop_sequences is None:
            self.additional_stop_sequences = []
        return self.additional_stop_sequences

    def add_additional_stop_sequences(self, sequences: List[str]):
        if self.additional_stop_sequences is None:
            self.additional_stop_sequences = []
        self.additional_stop_sequences.extend(sequences)

    def is_streaming(self):
        return self.stream

    @staticmethod
    def load_from_dict(settings: dict) -> "LlamaCppPythonSamplingSettings":
        """
        Load the settings from a dictionary.

        Args:
            settings (dict): The dictionary containing the settings.

        Returns:
            LlamaCppPythonSamplingSettings: The loaded settings.
        """
        return LlamaCppPythonSamplingSettings(**settings)

    def as_dict(self) -> dict:
        """
        Convert the settings to a dictionary.

        Returns:
            dict: The dictionary representation of the settings.
        """
        return self.__dict__

`load_from_dict(settings)` `staticmethod`

Load the settings from a dictionary.

Parameters:

settings (dict) –

The dictionary containing the settings.

Returns:

LlamaCppPythonSamplingSettings ( LlamaCppPythonSamplingSettings ) –

The loaded settings.
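
`as_dict` and `load_from_dict` are inverses for plain field values, so together they can carry settings through JSON. The sketch below shows such a round trip under stated assumptions: `DemoPythonSettings` is a hypothetical stand-in with three of the documented fields and the same method bodies, so the example runs without the library.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DemoPythonSettings:
    """Hypothetical stand-in mirroring three fields of LlamaCppPythonSamplingSettings."""
    temperature: float = 0.8
    max_tokens: int = -1
    additional_stop_sequences: Optional[List[str]] = None

    @staticmethod
    def load_from_dict(settings: dict) -> "DemoPythonSettings":
        return DemoPythonSettings(**settings)

    def as_dict(self) -> dict:
        return self.__dict__


original = DemoPythonSettings(
    temperature=0.3, max_tokens=256, additional_stop_sequences=["###"]
)

# Serialise the field values to JSON text, then rebuild a new instance from it.
payload = json.dumps(original.as_dict())
restored = DemoPythonSettings.load_from_dict(json.loads(payload))
```

The round trip works here because every field value is JSON-serialisable; a field holding a non-JSON type would need its own encoding step.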

Source code in llama_cpp_agent/providers/llama_cpp_python.py

@staticmethod
def load_from_dict(settings: dict) -> "LlamaCppPythonSamplingSettings":
    """
    Load the settings from a dictionary.

    Args:
        settings (dict): The dictionary containing the settings.

    Returns:
        LlamaCppPythonSamplingSettings: The loaded settings.
    """
    return LlamaCppPythonSamplingSettings(**settings)

`as_dict()`

Convert the settings to a dictionary.

Returns:

dict ( dict ) –

The dictionary representation of the settings.

Source code in llama_cpp_agent/providers/llama_cpp_python.py

def as_dict(self) -> dict:
    """
    Convert the settings to a dictionary.

    Returns:
        dict: The dictionary representation of the settings.
    """
    return self.__dict__

TGI - Server

`llama_cpp_agent.providers.tgi_server`

`TGIServerSamplingSettings` `dataclass`

Bases: LlmSamplingSettings

Sampling settings for generating completions with a Text Generation Inference (TGI) server.
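
The field definitions in the source listing attach bounds such as `minimum`, `exclusiveMinimum` and `maximum` as dataclass metadata. Python dataclasses do not enforce metadata on their own, so the sketch below shows one way a caller could check them. `DemoTGISettings` is a hypothetical stand-in that copies three field definitions, and `check_bounds` is an assumed helper, not part of `llama_cpp_agent`; whether the library or the server validates these bounds elsewhere is not stated here.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class DemoTGISettings:
    """Hypothetical stand-in copying three field definitions from the TGI settings."""
    temperature: Optional[float] = field(default=None, metadata={"exclusiveMinimum": 0})
    top_p: Optional[float] = field(
        default=None, metadata={"maximum": 1, "exclusiveMinimum": 0}
    )
    max_new_tokens: Optional[int] = field(default=None, metadata={"minimum": 0})


def check_bounds(settings) -> List[str]:
    """Return one message for each set field that violates its metadata bounds."""
    problems = []
    for f in fields(settings):
        value = getattr(settings, f.name)
        if value is None:
            continue  # unset fields are skipped
        meta = f.metadata
        if "minimum" in meta and value < meta["minimum"]:
            problems.append(f"{f.name} must be >= {meta['minimum']}")
        if "exclusiveMinimum" in meta and value <= meta["exclusiveMinimum"]:
            problems.append(f"{f.name} must be > {meta['exclusiveMinimum']}")
        if "maximum" in meta and value > meta["maximum"]:
            problems.append(f"{f.name} must be <= {meta['maximum']}")
    return problems


# The dataclass accepts out-of-range values without complaint.
bad = DemoTGISettings(temperature=0.0, top_p=1.5, max_new_tokens=64)
problems = check_bounds(bad)
```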

Source code in llama_cpp_agent/providers/tgi_server.py

@dataclass
class TGIServerSamplingSettings(LlmSamplingSettings):
    """
    Sampling settings for generating completions with a Text Generation Inference (TGI) server.
    """

    best_of: Optional[int] = field(default=None, metadata={"minimum": 0})
    decoder_input_details: bool = False
    details: bool = True
    do_sample: bool = False
    frequency_penalty: Optional[float] = field(
        default=None, metadata={"exclusiveMinimum": -2}
    )
    grammar: Optional[dict] = None
    max_new_tokens: Optional[int] = field(default=None, metadata={"minimum": 0})
    repetition_penalty: Optional[float] = field(
        default=None, metadata={"exclusiveMinimum": 0}
    )
    return_full_text: Optional[bool] = field(default=None)
    seed: Optional[int] = field(default=None, metadata={"minimum": 0})
    stop: Optional[List[str]] = field(default_factory=list)
    temperature: Optional[float] = field(default=None, metadata={"exclusiveMinimum": 0})
    top_k: Optional[int] = field(default=None, metadata={"exclusiveMinimum": 0})
    top_n_tokens: Optional[int] = field(
        default=None, metadata={"minimum": 0, "exclusiveMinimum": 0}
    )
    top_p: Optional[float] = field(
        default=None, metadata={"maximum": 1, "exclusiveMinimum": 0}
    )
    truncate: Optional[int] = field(default=None, metadata={"minimum": 0})
    typical_p: Optional[float] = field(
        default=None, metadata={"maximum": 1, "exclusiveMinimum": 0}
    )
    watermark: bool = False
    stream: bool = False

    def get_provider_identifier(self) -> LlmProviderId:
        return LlmProviderId.tgi_server

    def get_additional_stop_sequences(self) -> Union[List[str], None]:
        return self.stop

    def add_additional_stop_sequences(self, sequences: List[str]):
        self.stop.extend(sequences)

    def is_streaming(self):
        return self.stream

    @staticmethod
    def load_from_dict(settings: dict) -> "TGIServerSamplingSettings":
        """
        Load the settings from a dictionary.

        Args:
            settings (dict): The dictionary containing the settings.

        Returns:
            TGIServerSamplingSettings: The loaded settings.
        """
        return TGIServerSamplingSettings(**settings)

    def as_dict(self) -> dict:
        """
        Convert the settings to a dictionary.

        Returns:
            dict: The dictionary representation of the settings.
        """
        return self.__dict__

`load_from_dict(settings)` `staticmethod`

Load the settings from a dictionary.

Parameters:

settings (dict) –

The dictionary containing the settings.

Returns:

TGIServerSamplingSettings ( TGIServerSamplingSettings ) –

The loaded settings.

Source code in llama_cpp_agent/providers/tgi_server.py

@staticmethod
def load_from_dict(settings: dict) -> "TGIServerSamplingSettings":
    """
    Load the settings from a dictionary.

    Args:
        settings (dict): The dictionary containing the settings.

    Returns:
        TGIServerSamplingSettings: The loaded settings.
    """
    return TGIServerSamplingSettings(**settings)

`as_dict()`

Convert the settings to a dictionary.

Returns:

dict ( dict ) –

The dictionary representation of the settings.

Source code in llama_cpp_agent/providers/tgi_server.py

def as_dict(self) -> dict:
    """
    Convert the settings to a dictionary.

    Returns:
        dict: The dictionary representation of the settings.
    """
    return self.__dict__

vllm - Server

`llama_cpp_agent.providers.vllm_server`

`VLLMServerSamplingSettings` `dataclass`

Bases: LlmSamplingSettings

Sampling settings for generating completions with a vLLM server.
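
In the source listing, the stop-sequence methods of this class behave differently from the other providers: `get_additional_stop_sequences` returns `None` and `add_additional_stop_sequences` does nothing. The sketch below shows how provider-agnostic code might cope with both shapes. `DemoVLLMSettings` and `DemoTGISettings` are hypothetical stand-ins that copy only those method bodies, and `stop_sequences_or_empty` is an assumed helper.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class DemoVLLMSettings:
    """Hypothetical stand-in copying the vLLM stop-sequence methods."""
    stream: bool = False

    def get_additional_stop_sequences(self) -> Union[List[str], None]:
        return None

    def add_additional_stop_sequences(self, sequences: List[str]):
        pass  # sequences are discarded


@dataclass
class DemoTGISettings:
    """Hypothetical stand-in copying the TGI stop-sequence methods."""
    stop: Optional[List[str]] = field(default_factory=list)

    def get_additional_stop_sequences(self) -> Union[List[str], None]:
        return self.stop

    def add_additional_stop_sequences(self, sequences: List[str]):
        self.stop.extend(sequences)


def stop_sequences_or_empty(settings) -> List[str]:
    """Normalise a provider's stop sequences to a list, treating None as empty."""
    return settings.get_additional_stop_sequences() or []


vllm_settings = DemoVLLMSettings()
tgi_settings = DemoTGISettings()
for s in (vllm_settings, tgi_settings):
    s.add_additional_stop_sequences(["</s>"])

vllm_stops = stop_sequences_or_empty(vllm_settings)
tgi_stops = stop_sequences_or_empty(tgi_settings)
```

With these stand-ins the added sequence survives only on the TGI-style object, so callers that rely on extra stop sequences should not assume every provider stores them.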

Source code in llama_cpp_agent/providers/vllm_server.py

@dataclass
class VLLMServerSamplingSettings(LlmSamplingSettings):
    """
    Sampling settings for generating completions with a vLLM server.
    """

    best_of: Optional[int] = None
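    use_beam_search: bool = False  # annotated so the dataclass treats it as a field
    top_k: int = -1  # an integer count of candidate tokens; -1 disables top-k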
    top_p: float = 1
    min_p: float = 0.0
    temperature: float = 0.7
    max_tokens: int = 16
    repetition_penalty: Optional[float] = 1.0
    length_penalty: Optional[float] = 1.0
    early_stopping: Optional[bool] = False
    ignore_eos: Optional[bool] = False
    min_tokens: Optional[int] = 0
    stop_token_ids: Optional[List[int]] = field(default_factory=list)
    skip_special_tokens: Optional[bool] = True
    spaces_between_special_tokens: Optional[bool] = True
    stream: bool = False

    def get_provider_identifier(self) -> LlmProviderId:
        return LlmProviderId.vllm_server

    def get_additional_stop_sequences(self) -> Union[List[str], None]:
        return None

    def add_additional_stop_sequences(self, sequences: List[str]):
        pass

    def is_streaming(self):
        return self.stream

    @staticmethod
    def load_from_dict(settings: dict) -> "VLLMServerSamplingSettings":
        """
        Load the settings from a dictionary.

        Args:
            settings (dict): The dictionary containing the settings.

        Returns:
            VLLMServerSamplingSettings: The loaded settings.
        """
        return VLLMServerSamplingSettings(**settings)

    def as_dict(self) -> dict:
        """
        Convert the settings to a dictionary.

        Returns:
            dict: The dictionary representation of the settings.
        """
        return self.__dict__

`load_from_dict(settings)` `staticmethod`

Load the settings from a dictionary.

Parameters:

settings (dict) –

The dictionary containing the settings.

Returns:

VLLMServerSamplingSettings ( VLLMServerSamplingSettings ) –

The loaded settings.

Source code in llama_cpp_agent/providers/vllm_server.py

@staticmethod
def load_from_dict(settings: dict) -> "VLLMServerSamplingSettings":
    """
    Load the settings from a dictionary.

    Args:
        settings (dict): The dictionary containing the settings.

    Returns:
        VLLMServerSamplingSettings: The loaded settings.
    """
    return VLLMServerSamplingSettings(**settings)

`as_dict()`

Convert the settings to a dictionary.

Returns:

dict ( dict ) –

The dictionary representation of the settings.

Source code in llama_cpp_agent/providers/vllm_server.py

def as_dict(self) -> dict:
    """
    Convert the settings to a dictionary.

    Returns:
        dict: The dictionary representation of the settings.
    """
    return self.__dict__
