Using RetrieveChat with Qdrant for Retrieve Augmented Code Generation and Question Answering

Qdrant is a high-performance vector search engine/database.

This notebook demonstrates the usage of Qdrant for RAG, based on agentchat_RetrieveChat.ipynb.

RetrieveChat is a conversational system for retrieve augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM’s training dataset. RetrieveChat uses the AssistantAgent and RetrieveUserProxyAgent, which is similar to the usage of AssistantAgent and UserProxyAgent in other notebooks (e.g., Automated Task Solving with Code Generation, Execution & Debugging).

We’ll demonstrate usage of RetrieveChat with Qdrant for code generation and question answering w/ human feedback.

Requirements

Some extra dependencies are needed for this notebook, which can be installed via pip:

pip install "autogen[retrievechat-qdrant]" "flaml[automl]"

For more information, please refer to the installation guide.

%pip install "autogen[retrievechat-qdrant]" "flaml[automl]" -q

Note: you may need to restart the kernel to use updated packages.

Set your API Endpoint

The config_list_from_json function loads a list of configurations from an environment variable or a json file.

from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

import autogen
from autogen import AssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

# Accepted file formats for that can be stored in
# a vector database instance
from autogen.retrieve_utils import TEXT_FORMATS

config_list = autogen.config_list_from_json("OAI_CONFIG_LIST")

assert len(config_list) > 0
print("models to use: ", [config_list[i]["model"] for i in range(len(config_list))])

models to use:  ['gpt4-1106-preview', 'gpt-4o', 'gpt-35-turbo', 'gpt-35-turbo-0613']

tip

Learn more about configuring LLMs for agents here.

print("Accepted file formats for `docs_path`:")
print(TEXT_FORMATS)

Accepted file formats for `docs_path`:
['rtf', 'jsonl', 'xml', 'json', 'md', 'rst', 'docx', 'msg', 'pdf', 'log', 'xlsx', 'org', 'txt', 'csv', 'pptx', 'tsv', 'yml', 'epub', 'yaml', 'ppt', 'htm', 'doc', 'odt', 'html']

Construct agents for RetrieveChat

We start by initializing the AssistantAgent and RetrieveUserProxyAgent. The system message needs to be set to “You are a helpful assistant.” for AssistantAgent. The detailed instructions are given in the user message. Later we will use the RetrieveUserProxyAgent.generate_init_prompt to combine the instructions and a retrieval augmented generation task for an initial prompt to be sent to the LLM assistant.

You can find the list of all the embedding models supported by Qdrant here.

# 1. create an AssistantAgent instance named "assistant"
assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config={
        "timeout": 600,
        "cache_seed": 42,
        "config_list": config_list,
    },
)

# Optionally create embedding function object
sentence_transformer_ef = SentenceTransformer("all-distilroberta-v1").encode
client = QdrantClient(":memory:")

# 2. create the RetrieveUserProxyAgent instance named "ragproxyagent"
# Refer to https://ag2ai.github.io/ag2/docs/reference/agentchat/contrib/retrieve_user_proxy_agent
# and https://ag2ai.github.io/ag2/docs/reference/agentchat/contrib/vectordb/qdrant
# for more information on the RetrieveUserProxyAgent and QdrantVectorDB
ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    retrieve_config={
        "task": "code",
        "docs_path": [
            "https://raw.githubusercontent.com/ag2ai/flaml/main/README.md",
            "https://raw.githubusercontent.com/ag2ai/FLAML/main/website/docs/Research.md",
        ],  # change this to your own path, such as https://raw.githubusercontent.com/ag2ai/ag2/main/README.md
        "chunk_token_size": 2000,
        "model": config_list[0]["model"],
        "db_config": {"client": client},
        "vector_db": "qdrant",  # qdrant database
        "get_or_create": True,  # set to False if you don't want to reuse an existing collection
        "overwrite": True,  # set to True if you want to overwrite an existing collection
        "embedding_function": sentence_transformer_ef,  # If left out fastembed "BAAI/bge-small-en-v1.5" will be used
    },
    code_execution_config=False,
)

Fetching 5 files:   0%|          | 0/5 [00:00<?, ?it/s]

### Example 1

Use RetrieveChat to answer a question and ask for human-in-loop feedbacks.

Problem: Is there a function named tune_automl in FLAML?

# reset the assistant. Always reset the assistant before starting a new conversation.
assistant.reset()

qa_problem = "Is there a function called tune_automl?"
chat_results = ragproxyagent.initiate_chat(assistant, message=ragproxyagent.message_generator, problem=qa_problem)

Trying to create collection.
VectorDB returns doc_ids:  [['987f060a-4399-b91a-0e51-51b6165ea5bb', '0ecd7192-3761-7d6f-9151-5ff504ca740b', 'ddbaaafc-abdd-30b4-eecd-ec2c32818952']]
Adding content of doc 987f060a-4399-b91a-0e51-51b6165ea5bb to context.
ragproxyagent (to assistant):

You're a retrieve augmented coding assistant. You answer user's questions based on your own knowledge and the
context provided by the user.
If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.
For code generation, you must obey the following rules:
Rule 1. You MUST NOT install any packages because all the packages needed are already installed.
Rule 2. You must follow the formats below to write your code:
```language
# your code
```

User's question is: Is there a function called tune_automl?

Context is: [![PyPI version](https://badge.fury.io/py/FLAML.svg)](https://badge.fury.io/py/FLAML)
![Conda version](https://img.shields.io/conda/vn/conda-forge/flaml)
[![Build](https://github.com/microsoft/FLAML/actions/workflows/python-package.yml/badge.svg)](https://github.com/microsoft/FLAML/actions/workflows/python-package.yml)
![Python Version](https://img.shields.io/badge/3.8%20%7C%203.9%20%7C%203.10-blue)
[![Downloads](https://pepy.tech/badge/flaml)](https://pepy.tech/project/flaml)
[![](https://img.shields.io/discord/1025786666260111483?logo=discord&style=flat)](https://discord.gg/Cppx2vSPVP)

<!-- [![Join the chat at https://gitter.im/FLAMLer/community](https://badges.gitter.im/FLAMLer/community.svg)](https://gitter.im/FLAMLer/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) -->

# A Fast Library for Automated Machine Learning & Tuning

<p align="center">
    <img src="https://github.com/microsoft/FLAML/blob/main/website/static/img/flaml.svg"  width=200>
    <br>
</p>

:fire: Heads-up: We have migrated [AutoGen](https://ag2ai.github.io/autogen/) into a dedicated [github repository](https://github.com/ag2ai/ag2). Alongside this move, we have also launched a dedicated [Discord](https://discord.gg/pAbnFJrkgZ) server and a [website](https://ag2ai.github.io/autogen/) for comprehensive documentation.

:fire: The automated multi-agent chat framework in [AutoGen](https://ag2ai.github.io/autogen/) is in preview from v2.0.0.

:fire: FLAML is highlighted in OpenAI's [cookbook](https://github.com/openai/openai-cookbook#related-resources-from-around-the-web).

:fire: [autogen](https://ag2ai.github.io/autogen/) is released with support for ChatGPT and GPT-4, based on [Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference](https://arxiv.org/abs/2303.04673).

:fire: FLAML supports Code-First AutoML & Tuning – Private Preview in [Microsoft Fabric Data Science](https://learn.microsoft.com/en-us/fabric/data-science/).

## What is FLAML

FLAML is a lightweight Python library for efficient automation of machine
learning and AI operations. It automates workflow based on large language models, machine learning models, etc.
and optimizes their performance.

- FLAML enables building next-gen GPT-X applications based on multi-agent conversations with minimal effort. It simplifies the orchestration, automation and optimization of a complex GPT-X workflow. It maximizes the performance of GPT-X models and augments their weakness.
- For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It is easy to customize or extend. Users can find their desired customizability from a smooth range.
- It supports fast and economical automatic tuning (e.g., inference hyperparameters for foundation models, configurations in MLOps/LMOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations), capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping.

FLAML is powered by a series of [research studies](https://microsoft.github.io/FLAML/docs/Research/) from Microsoft Research and collaborators such as Penn State University, Stevens Institute of Technology, University of Washington, and University of Waterloo.

FLAML has a .NET implementation in [ML.NET](http://dot.net/ml), an open-source, cross-platform machine learning framework for .NET.

## Installation

FLAML requires **Python version >= 3.8**. It can be installed from pip:

```bash
pip install flaml
```

Minimal dependencies are installed without extra options. You can install extra options based on the feature you need. For example, use the following to install the dependencies needed by the [`autogen`](https://ag2ai.github.io/autogen/) package.

```bash
pip install "flaml[autogen]"
```

Find more options in [Installation](https://microsoft.github.io/FLAML/docs/Installation).
Each of the [`notebook examples`](https://github.com/microsoft/FLAML/tree/main/notebook) may require a specific option to be installed.

## Quickstart

- (New) The [autogen](https://ag2ai.github.io/autogen/) package enables the next-gen GPT-X applications with a generic multi-agent conversation framework.
  It offers customizable and conversable agents which integrate LLMs, tools and human.
  By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code. For example,

```python
from flaml import autogen

assistant = autogen.AssistantAgent("assistant")
user_proxy = autogen.UserProxyAgent("user_proxy")
user_proxy.initiate_chat(
    assistant,
    message="Show me the YTD gain of 10 largest technology companies as of today.",
)
# This initiates an automated chat between the two agents to solve the task
```

Autogen also helps maximize the utility out of the expensive LLMs such as ChatGPT and GPT-4. It offers a drop-in replacement of `openai.Completion` or `openai.ChatCompletion` with powerful functionalites like tuning, caching, templating, filtering. For example, you can optimize generations by LLM with your own tuning data, success metrics and budgets.

```python
# perform tuning
config, analysis = autogen.Completion.tune(
    data=tune_data,
    metric="success",
    mode="max",
    eval_func=eval_func,
    inference_budget=0.05,
    optimization_budget=3,
    num_samples=-1,
)
# perform inference for a test instance
response = autogen.Completion.create(context=test_instance, **config)
```

- With three lines of code, you can start using this economical and fast
  AutoML engine as a [scikit-learn style estimator](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML).

```python
from flaml import AutoML

automl = AutoML()
automl.fit(X_train, y_train, task="classification")
```

- You can restrict the learners and use FLAML as a fast hyperparameter tuning
  tool for XGBoost, LightGBM, Random Forest etc. or a [customized learner](https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML#estimator-and-search-space).

```python
automl.fit(X_train, y_train, task="classification", estimator_list=["lgbm"])
```

- You can also run generic hyperparameter tuning for a [custom function](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function).

```python
from flaml import tune
tune.run(evaluation_function, config={…}, low_cost_partial_config={…}, time_budget_s=3600)
```

- [Zero-shot AutoML](https://microsoft.github.io/FLAML/docs/Use-Cases/Zero-Shot-AutoML) allows using the existing training API from lightgbm, xgboost etc. while getting the benefit of AutoML in choosing high-performance hyperparameter configurations per task.

```python
from flaml.default import LGBMRegressor

# Use LGBMRegressor in the same way as you use lightgbm.LGBMRegressor.
estimator = LGBMRegressor()
# The hyperparameters are automatically set according to the training data.
estimator.fit(X_train, y_train)
```

## Documentation

You can find a detailed documentation about FLAML [here](https://microsoft.github.io/FLAML/).

In addition, you can find:

- [Research](https://microsoft.github.io/FLAML/docs/Research) and [blogposts](https://microsoft.github.io/FLAML/blog) around FLAML.

- [Discord](https://discord.gg/Cppx2vSPVP).

- [Contributing guide](https://microsoft.github.io/FLAML/docs/Contribute).

- ML.NET documentation and tutorials for [Model Builder](https://learn.microsoft.com/dotnet/machine-learning/tutorials/predict-prices-with-model-builder), [ML.NET CLI](https://learn.microsoft.com/dotnet/machine-learning/tutorials/sentiment-analysis-cli), and [AutoML API](https://learn.microsoft.com/dotnet/machine-learning/how-to-guides/how-to-use-the-automl-api).

## Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit <https://cla.opensource.microsoft.com>.



--------------------------------------------------------------------------------
assistant (to ragproxyagent):

No, there is no function called `tune_automl` specifically mentioned in the context provided. However, FLAML does offer general hyperparameter tuning capabilities which could be related to this. In the context of FLAML, there is a generic function called `tune.run()` that can be used for hyperparameter tuning.

Here's a short example of how to use FLAML's tune for a user-defined function based on the given context:

```python
from flaml import tune

def evaluation_function(config):
    # evaluation logic that returns a metric score
    ...

# define the search space for hyperparameters
config_search_space = {
    'max_depth': tune.randint(lower=3, upper=10),
    'learning_rate': tune.loguniform(lower=1e-4, upper=1e-1),
}

# run hyperparameter tuning
tune.run(
    evaluation_function,
    config=config_search_space,
    low_cost_partial_config={'max_depth': 3},
    time_budget_s=3600
)
```

Please note that if you are referring to a different kind of function or use case, you might need to specify more details or check the official documentation or source code of the FLAML library.

--------------------------------------------------------------------------------
ragproxyagent (to assistant):



--------------------------------------------------------------------------------
assistant (to ragproxyagent):

UPDATE CONTEXT

--------------------------------------------------------------------------------
Updating context and resetting conversation.
Adding content of doc 0ecd7192-3761-7d6f-9151-5ff504ca740b to context.
Adding content of doc ddbaaafc-abdd-30b4-eecd-ec2c32818952 to context.
ragproxyagent (to assistant):

You're a retrieve augmented coding assistant. You answer user's questions based on your own knowledge and the
context provided by the user.
If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.
For code generation, you must obey the following rules:
Rule 1. You MUST NOT install any packages because all the packages needed are already installed.
Rule 2. You must follow the formats below to write your code:
```language
# your code
```

User's question is: Is there a function called tune_automl?

Context is: # Research

For technical details, please check our research publications.

- [FLAML: A Fast and Lightweight AutoML Library](https://www.microsoft.com/en-us/research/publication/flaml-a-fast-and-lightweight-automl-library/). Chi Wang, Qingyun Wu, Markus Weimer, Erkang Zhu. MLSys 2021.

```bibtex
@inproceedings{wang2021flaml,
    title={FLAML: A Fast and Lightweight AutoML Library},
    author={Chi Wang and Qingyun Wu and Markus Weimer and Erkang Zhu},
    year={2021},
    booktitle={MLSys},
}
```

- [Frugal Optimization for Cost-related Hyperparameters](https://arxiv.org/abs/2005.01571). Qingyun Wu, Chi Wang, Silu Huang. AAAI 2021.

```bibtex
@inproceedings{wu2021cfo,
    title={Frugal Optimization for Cost-related Hyperparameters},
    author={Qingyun Wu and Chi Wang and Silu Huang},
    year={2021},
    booktitle={AAAI},
}
```

- [Economical Hyperparameter Optimization With Blended Search Strategy](https://www.microsoft.com/en-us/research/publication/economical-hyperparameter-optimization-with-blended-search-strategy/). Chi Wang, Qingyun Wu, Silu Huang, Amin Saied. ICLR 2021.

```bibtex
@inproceedings{wang2021blendsearch,
    title={Economical Hyperparameter Optimization With Blended Search Strategy},
    author={Chi Wang and Qingyun Wu and Silu Huang and Amin Saied},
    year={2021},
    booktitle={ICLR},
}
```

- [An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models](https://aclanthology.org/2021.acl-long.178.pdf). Susan Xueqing Liu, Chi Wang. ACL 2021.

```bibtex
@inproceedings{liuwang2021hpolm,
    title={An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models},
    author={Susan Xueqing Liu and Chi Wang},
    year={2021},
    booktitle={ACL},
}
```

- [ChaCha for Online AutoML](https://www.microsoft.com/en-us/research/publication/chacha-for-online-automl/). Qingyun Wu, Chi Wang, John Langford, Paul Mineiro and Marco Rossi. ICML 2021.

```bibtex
@inproceedings{wu2021chacha,
    title={ChaCha for Online AutoML},
    author={Qingyun Wu and Chi Wang and John Langford and Paul Mineiro and Marco Rossi},
    year={2021},
    booktitle={ICML},
}
```

- [Fair AutoML](https://arxiv.org/abs/2111.06495). Qingyun Wu, Chi Wang. ArXiv preprint arXiv:2111.06495 (2021).

```bibtex
@inproceedings{wuwang2021fairautoml,
    title={Fair AutoML},
    author={Qingyun Wu and Chi Wang},
    year={2021},
    booktitle={ArXiv preprint arXiv:2111.06495},
}
```

- [Mining Robust Default Configurations for Resource-constrained AutoML](https://arxiv.org/abs/2202.09927). Moe Kayali, Chi Wang. ArXiv preprint arXiv:2202.09927 (2022).

```bibtex
@inproceedings{kayaliwang2022default,
    title={Mining Robust Default Configurations for Resource-constrained AutoML},
    author={Moe Kayali and Chi Wang},
    year={2022},
    booktitle={ArXiv preprint arXiv:2202.09927},
}
```

- [Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives](https://openreview.net/forum?id=0Ij9_q567Ma). Shaokun Zhang, Feiran Jia, Chi Wang, Qingyun Wu. ICLR 2023 (notable-top-5%).

```bibtex
@inproceedings{zhang2023targeted,
    title={Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives},
    author={Shaokun Zhang and Feiran Jia and Chi Wang and Qingyun Wu},
    booktitle={International Conference on Learning Representations},
    year={2023},
    url={https://openreview.net/forum?id=0Ij9_q567Ma},
}
```

- [Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference](https://arxiv.org/abs/2303.04673). Chi Wang, Susan Xueqing Liu, Ahmed H. Awadallah. ArXiv preprint arXiv:2303.04673 (2023).

```bibtex
@inproceedings{wang2023EcoOptiGen,
    title={Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference},
    author={Chi Wang and Susan Xueqing Liu and Ahmed H. Awadallah},
    year={2023},
    booktitle={ArXiv preprint arXiv:2303.04673},
}
```

- [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337). Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang. ArXiv preprint arXiv:2306.01337 (2023).

```bibtex
@inproceedings{wu2023empirical,
    title={An Empirical Study on Challenging Math Problem Solving with GPT-4},
    author={Yiran Wu and Feiran Jia and Shaokun Zhang and Hangyu Li and Erkang Zhu and Yue Wang and Yin Tat Lee and Richard Peng and Qingyun Wu and Chi Wang},
    year={2023},
    booktitle={ArXiv preprint arXiv:2306.01337},
}
```
If you are new to GitHub [here](https://help.github.com/categories/collaborating-with-issues-and-pull-requests/) is a detailed help source on getting involved with development on GitHub.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.


ragproxyagent (to assistant):

You're a retrieve augmented coding assistant. You answer user's questions based on your own knowledge and the
context provided by the user.
If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.
For code generation, you must obey the following rules:
Rule 1. You MUST NOT install any packages because all the packages needed are already installed.
Rule 2. You must follow the formats below to write your code:
```language
# your code
```

User's question is: Is there a function called tune_automl?

Context is: # Research

For technical details, please check our research publications.

- [FLAML: A Fast and Lightweight AutoML Library](https://www.microsoft.com/en-us/research/publication/flaml-a-fast-and-lightweight-automl-library/). Chi Wang, Qingyun Wu, Markus Weimer, Erkang Zhu. MLSys 2021.

```bibtex
@inproceedings{wang2021flaml,
    title={FLAML: A Fast and Lightweight AutoML Library},
    author={Chi Wang and Qingyun Wu and Markus Weimer and Erkang Zhu},
    year={2021},
    booktitle={MLSys},
}
```

- [Frugal Optimization for Cost-related Hyperparameters](https://arxiv.org/abs/2005.01571). Qingyun Wu, Chi Wang, Silu Huang. AAAI 2021.

```bibtex
@inproceedings{wu2021cfo,
    title={Frugal Optimization for Cost-related Hyperparameters},
    author={Qingyun Wu and Chi Wang and Silu Huang},
    year={2021},
    booktitle={AAAI},
}
```

- [Economical Hyperparameter Optimization With Blended Search Strategy](https://www.microsoft.com/en-us/research/publication/economical-hyperparameter-optimization-with-blended-search-strategy/). Chi Wang, Qingyun Wu, Silu Huang, Amin Saied. ICLR 2021.

```bibtex
@inproceedings{wang2021blendsearch,
    title={Economical Hyperparameter Optimization With Blended Search Strategy},
    author={Chi Wang and Qingyun Wu and Silu Huang and Amin Saied},
    year={2021},
    booktitle={ICLR},
}
```

- [An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models](https://aclanthology.org/2021.acl-long.178.pdf). Susan Xueqing Liu, Chi Wang. ACL 2021.

```bibtex
@inproceedings{liuwang2021hpolm,
    title={An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models},
    author={Susan Xueqing Liu and Chi Wang},
    year={2021},
    booktitle={ACL},
}
```

- [ChaCha for Online AutoML](https://www.microsoft.com/en-us/research/publication/chacha-for-online-automl/). Qingyun Wu, Chi Wang, John Langford, Paul Mineiro and Marco Rossi. ICML 2021.

```bibtex
@inproceedings{wu2021chacha,
    title={ChaCha for Online AutoML},
    author={Qingyun Wu and Chi Wang and John Langford and Paul Mineiro and Marco Rossi},
    year={2021},
    booktitle={ICML},
}
```

- [Fair AutoML](https://arxiv.org/abs/2111.06495). Qingyun Wu, Chi Wang. ArXiv preprint arXiv:2111.06495 (2021).

```bibtex
@inproceedings{wuwang2021fairautoml,
    title={Fair AutoML},
    author={Qingyun Wu and Chi Wang},
    year={2021},
    booktitle={ArXiv preprint arXiv:2111.06495},
}
```

- [Mining Robust Default Configurations for Resource-constrained AutoML](https://arxiv.org/abs/2202.09927). Moe Kayali, Chi Wang. ArXiv preprint arXiv:2202.09927 (2022).

```bibtex
@inproceedings{kayaliwang2022default,
    title={Mining Robust Default Configurations for Resource-constrained AutoML},
    author={Moe Kayali and Chi Wang},
    year={2022},
    booktitle={ArXiv preprint arXiv:2202.09927},
}
```

- [Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives](https://openreview.net/forum?id=0Ij9_q567Ma). Shaokun Zhang, Feiran Jia, Chi Wang, Qingyun Wu. ICLR 2023 (notable-top-5%).

```bibtex
@inproceedings{zhang2023targeted,
    title={Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives},
    author={Shaokun Zhang and Feiran Jia and Chi Wang and Qingyun Wu},
    booktitle={International Conference on Learning Representations},
    year={2023},
    url={https://openreview.net/forum?id=0Ij9_q567Ma},
}
```

- [Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference](https://arxiv.org/abs/2303.04673). Chi Wang, Susan Xueqing Liu, Ahmed H. Awadallah. ArXiv preprint arXiv:2303.04673 (2023).

```bibtex
@inproceedings{wang2023EcoOptiGen,
    title={Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference},
    author={Chi Wang and Susan Xueqing Liu and Ahmed H. Awadallah},
    year={2023},
    booktitle={ArXiv preprint arXiv:2303.04673},
}
```

- [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337). Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang. ArXiv preprint arXiv:2306.01337 (2023).

```bibtex
@inproceedings{wu2023empirical,
    title={An Empirical Study on Challenging Math Problem Solving with GPT-4},
    author={Yiran Wu and Feiran Jia and Shaokun Zhang and Hangyu Li and Erkang Zhu and Yue Wang and Yin Tat Lee and Richard Peng and Qingyun Wu and Chi Wang},
    year={2023},
    booktitle={ArXiv preprint arXiv:2306.01337},
}
```
If you are new to GitHub [here](https://help.github.com/categories/collaborating-with-issues-and-pull-requests/) is a detailed help source on getting involved with development on GitHub.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.



--------------------------------------------------------------------------------
assistant (to ragproxyagent):

UPDATE CONTEXT

--------------------------------------------------------------------------------
Updating context and resetting conversation.
VectorDB returns doc_ids:  [['987f060a-4399-b91a-0e51-51b6165ea5bb', '0ecd7192-3761-7d6f-9151-5ff504ca740b', 'ddbaaafc-abdd-30b4-eecd-ec2c32818952']]
VectorDB returns doc_ids:  [['987f060a-4399-b91a-0e51-51b6165ea5bb', '0ecd7192-3761-7d6f-9151-5ff504ca740b', 'ddbaaafc-abdd-30b4-eecd-ec2c32818952']]
VectorDB returns doc_ids:  [['987f060a-4399-b91a-0e51-51b6165ea5bb', '0ecd7192-3761-7d6f-9151-5ff504ca740b', 'ddbaaafc-abdd-30b4-eecd-ec2c32818952']]
VectorDB returns doc_ids:  [['987f060a-4399-b91a-0e51-51b6165ea5bb', '0ecd7192-3761-7d6f-9151-5ff504ca740b', 'ddbaaafc-abdd-30b4-eecd-ec2c32818952']]
No more context, will terminate.
ragproxyagent (to assistant):

TERMINATE

--------------------------------------------------------------------------------

2024-07-15 23:19:34,988 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - Found 3 chunks.
Model gpt4-1106-preview not found. Using cl100k_base encoding.
Model gpt4-1106-preview not found. Using cl100k_base encoding.
Model gpt4-1106-preview not found. Using cl100k_base encoding.
Model gpt4-1106-preview not found. Using cl100k_base encoding.

### Example 2

Use RetrieveChat to answer a question that is not related to code generation.

Problem: Who is the author of FLAML?

# reset the assistant. Always reset the assistant before starting a new conversation.
assistant.reset()

qa_problem = "Who is the author of FLAML?"
chat_results = ragproxyagent.initiate_chat(assistant, message=ragproxyagent.message_generator, problem=qa_problem)

Model gpt4-1106-preview not found. Using cl100k_base encoding.
Model gpt4-1106-preview not found. Using cl100k_base encoding.

VectorDB returns doc_ids:  [['0ecd7192-3761-7d6f-9151-5ff504ca740b', '987f060a-4399-b91a-0e51-51b6165ea5bb', 'ddbaaafc-abdd-30b4-eecd-ec2c32818952']]
Adding content of doc 0ecd7192-3761-7d6f-9151-5ff504ca740b to context.
ragproxyagent (to assistant):

You're a retrieve augmented coding assistant. You answer user's questions based on your own knowledge and the
context provided by the user.
If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.
For code generation, you must obey the following rules:
Rule 1. You MUST NOT install any packages because all the packages needed are already installed.
Rule 2. You must follow the formats below to write your code:
```language
# your code
```

User's question is: Who is the author of FLAML?

Context is: # Research

For technical details, please check our research publications.

- [FLAML: A Fast and Lightweight AutoML Library](https://www.microsoft.com/en-us/research/publication/flaml-a-fast-and-lightweight-automl-library/). Chi Wang, Qingyun Wu, Markus Weimer, Erkang Zhu. MLSys 2021.

```bibtex
@inproceedings{wang2021flaml,
    title={FLAML: A Fast and Lightweight AutoML Library},
    author={Chi Wang and Qingyun Wu and Markus Weimer and Erkang Zhu},
    year={2021},
    booktitle={MLSys},
}
```

- [Frugal Optimization for Cost-related Hyperparameters](https://arxiv.org/abs/2005.01571). Qingyun Wu, Chi Wang, Silu Huang. AAAI 2021.

```bibtex
@inproceedings{wu2021cfo,
    title={Frugal Optimization for Cost-related Hyperparameters},
    author={Qingyun Wu and Chi Wang and Silu Huang},
    year={2021},
    booktitle={AAAI},
}
```

- [Economical Hyperparameter Optimization With Blended Search Strategy](https://www.microsoft.com/en-us/research/publication/economical-hyperparameter-optimization-with-blended-search-strategy/). Chi Wang, Qingyun Wu, Silu Huang, Amin Saied. ICLR 2021.

```bibtex
@inproceedings{wang2021blendsearch,
    title={Economical Hyperparameter Optimization With Blended Search Strategy},
    author={Chi Wang and Qingyun Wu and Silu Huang and Amin Saied},
    year={2021},
    booktitle={ICLR},
}
```

- [An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models](https://aclanthology.org/2021.acl-long.178.pdf). Susan Xueqing Liu, Chi Wang. ACL 2021.

```bibtex
@inproceedings{liuwang2021hpolm,
    title={An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models},
    author={Susan Xueqing Liu and Chi Wang},
    year={2021},
    booktitle={ACL},
}
```

- [ChaCha for Online AutoML](https://www.microsoft.com/en-us/research/publication/chacha-for-online-automl/). Qingyun Wu, Chi Wang, John Langford, Paul Mineiro and Marco Rossi. ICML 2021.

```bibtex
@inproceedings{wu2021chacha,
    title={ChaCha for Online AutoML},
    author={Qingyun Wu and Chi Wang and John Langford and Paul Mineiro and Marco Rossi},
    year={2021},
    booktitle={ICML},
}
```

- [Fair AutoML](https://arxiv.org/abs/2111.06495). Qingyun Wu, Chi Wang. ArXiv preprint arXiv:2111.06495 (2021).

```bibtex
@inproceedings{wuwang2021fairautoml,
    title={Fair AutoML},
    author={Qingyun Wu and Chi Wang},
    year={2021},
    booktitle={ArXiv preprint arXiv:2111.06495},
}
```

- [Mining Robust Default Configurations for Resource-constrained AutoML](https://arxiv.org/abs/2202.09927). Moe Kayali, Chi Wang. ArXiv preprint arXiv:2202.09927 (2022).

```bibtex
@inproceedings{kayaliwang2022default,
    title={Mining Robust Default Configurations for Resource-constrained AutoML},
    author={Moe Kayali and Chi Wang},
    year={2022},
    booktitle={ArXiv preprint arXiv:2202.09927},
}
```

- [Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives](https://openreview.net/forum?id=0Ij9_q567Ma). Shaokun Zhang, Feiran Jia, Chi Wang, Qingyun Wu. ICLR 2023 (notable-top-5%).

```bibtex
@inproceedings{zhang2023targeted,
    title={Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives},
    author={Shaokun Zhang and Feiran Jia and Chi Wang and Qingyun Wu},
    booktitle={International Conference on Learning Representations},
    year={2023},
    url={https://openreview.net/forum?id=0Ij9_q567Ma},
}
```

- [Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference](https://arxiv.org/abs/2303.04673). Chi Wang, Susan Xueqing Liu, Ahmed H. Awadallah. ArXiv preprint arXiv:2303.04673 (2023).

```bibtex
@inproceedings{wang2023EcoOptiGen,
    title={Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference},
    author={Chi Wang and Susan Xueqing Liu and Ahmed H. Awadallah},
    year={2023},
    booktitle={ArXiv preprint arXiv:2303.04673},
}
```

- [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337). Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang. ArXiv preprint arXiv:2306.01337 (2023).

```bibtex
@inproceedings{wu2023empirical,
    title={An Empirical Study on Challenging Math Problem Solving with GPT-4},
    author={Yiran Wu and Feiran Jia and Shaokun Zhang and Hangyu Li and Erkang Zhu and Yue Wang and Yin Tat Lee and Richard Peng and Qingyun Wu and Chi Wang},
    year={2023},
    booktitle={ArXiv preprint arXiv:2306.01337},
}
```



--------------------------------------------------------------------------------
assistant (to ragproxyagent):

The authors of FLAML are Chi Wang, Qingyun Wu, Markus Weimer, and Erkang Zhu.

--------------------------------------------------------------------------------

Using RetrieveChat with Qdrant for Retrieve Augmented Code Generation and Question Answering

Set your API Endpoint​

Construct agents for RetrieveChat​

You can find the list of all the embedding models supported by Qdrant here.​

Set your API Endpoint

Construct agents for RetrieveChat

You can find the list of all the embedding models supported by Qdrant here.