
AutoC is an automated tool designed to extract and analyze Indicators of Compromise (IoCs) from open-source threat intelligence sources.
- Threat Intelligence Parsing: Parses blogs, reports, and feeds from various OSINT sources.
- IoC Extraction: Automatically extracts IoCs such as IP addresses, domains, file hashes, and more.
- Visualization: Displays extracted IoCs and analysis results in a user-friendly interface.
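To make the IoC extraction idea concrete, here is a minimal regex-based sketch of pulling IPs, domains, and hashes out of text (illustrative only; it is not AutoC's actual extraction pipeline):

```python
import re

# Illustrative patterns only -- real-world extraction also has to handle
# defanged indicators ("hxxp://", "[.]") and overlapping matches
# (an IPv4 address also matches the domain pattern below).
IOC_PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "domain": re.compile(r"\b[a-z0-9][a-z0-9-]*(?:\.[a-z0-9-]+)+\b", re.IGNORECASE),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
}

def extract_iocs(text: str) -> dict[str, set[str]]:
    """Return candidate IoCs found in the text, grouped by indicator type."""
    return {name: set(p.findall(text)) for name, p in IOC_PATTERNS.items()}

sample = "C2 at 203.0.113.7 (evil.example.com), dropper MD5 44d88612fea8a8f36de82e1278abb02f"
print(extract_iocs(sample))
```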
The fastest way to get started with AutoC is to run it using Docker (with `docker-compose`).
Make sure to set up the `.env` file with your API keys before running the app (see the Configuration section below for more details).
```bash
git clone https://github.com/barvhaim/AutoC.git
cd AutoC
docker-compose up --build
```
Once the app is up and running, you can access it at http://localhost:8000.
To enable optional services, use Docker Compose profiles:
- With crawl4ai: `docker-compose --profile crawl4ai up --build`
- With the Milvus vector database: `docker-compose --profile milvus up --build`
- With both: `docker-compose --profile crawl4ai --profile milvus up --build`
- Install Python 3.11 or later. (https://www.python.org/downloads/)
- Install the `uv` package manager (https://docs.astral.sh/uv/getting-started/installation/).
  - For Linux and macOS, you can use the following command: `curl -LsSf https://astral.sh/uv/install.sh | sh`
  - For Windows, you can use the following command: `powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"`
- Clone the project repository and navigate to the project directory: `git clone https://github.com/barvhaim/AutoC.git`, then `cd AutoC`
- Install the required Python packages using `uv`: `uv sync`
- Configure the `.env` file with your API keys (see the Configuration section below for more details).
Set up API keys by adding them to the `.env` file (use the `.env.example` file as a template).
You can use any of several LLM providers (e.g., IBM watsonx, OpenAI); you will configure which one to use in the next step.
```bash
cp .env.example .env
```
The supported LLM providers are:
- watsonx.ai by IBM ("watsonx") (Get API Key)
- OpenAI ("openai") - Experimental
- RITS (IBM internal) ("rits")
- Ollama ("ollama") - Experimental
| Provider (`LLM_PROVIDER`) | Models (`LLM_MODEL`) |
|---|---|
| watsonx.ai by IBM (`watsonx`) | `meta-llama/llama-3-3-70b-instruct`, `ibm-granite/granite-3.1-8b-instruct` |
| RITS (`rits`) | `meta-llama/llama-3-3-70b-instruct`, `ibm-granite/granite-3.1-8b-instruct`, `deepseek-ai/DeepSeek-V3` |
| OpenAI (`openai`) | `gpt-4.1-nano` |
| Ollama (`ollama`) (Experimental) | `granite3.2:8b` |
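For example, to use watsonx.ai with the Llama 3.3 70B model, the relevant entries in `.env` would look like the following (provider credentials such as the watsonx API key also go in this file; see `.env.example` for the exact variable names):

```
LLM_PROVIDER=watsonx
LLM_MODEL=meta-llama/llama-3-3-70b-instruct
```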
By default, AutoC uses a combination of the docling and beautifulsoup4 libraries to extract blog post content, with the `requests` library used behind the scenes to fetch the pages.
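For reference, a minimal sketch of that fetch-and-parse approach with `requests` and BeautifulSoup looks like this (illustrative only; AutoC's actual pipeline also runs the page through docling and may differ in detail):

```python
import requests
from bs4 import BeautifulSoup

def fetch_blog_text(url: str) -> str:
    """Fetch a blog post and return its visible text (illustrative sketch)."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Drop script/style tags and collapse the remaining text into one line.
    for tag in soup(["script", "style"]):
        tag.decompose()
    return " ".join(soup.get_text(separator=" ").split())
```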
There is also an option to use Crawl4AI, which uses a headless browser to fetch the blog post content; this is more reliable but requires additional setup.
To enable Crawl4AI, you need the Crawl4AI backend server, which can be run using Docker:
```bash
docker-compose --profile crawl4ai up -d
```
The crawl4ai service uses a profile configuration, so it only starts when explicitly requested with the `--profile crawl4ai` flag.
Then set the environment variables in the `.env` file to point to the Crawl4AI server:
```
USE_CRAWL4AI_HEADLESS_BROWSER_HTML_PARSER=true
CRAWL4AI_BASE_URL=http://localhost:11235
```
AutoC processes analyst questions about articles in two modes:
- Individual mode (default): Each question is processed separately with individual LLM calls
- Batch mode: All questions are processed together in a single LLM call for improved performance
To enable batch mode, set the environment variable in the `.env` file:
```
QNA_BATCH_MODE=true
```
You can also control this via API settings by including `"qna_batch_mode": true` in your request.
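A minimal sketch of passing this setting through the API (the endpoint path and payload shape below are hypothetical illustrations, not the documented API contract; check the running app at http://localhost:8000 for the actual routes):

```python
import requests

# NOTE: "/api/extract" and the payload fields are hypothetical placeholders.
payload = {
    "url": "https://example.com/threat-report",
    "settings": {"qna_batch_mode": True},
}
response = requests.post("http://localhost:8000/api/extract", json=payload, timeout=300)
response.raise_for_status()
print(response.json())
```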
Benefits of batch mode:
- Reduces the number of LLM API calls from N (one per question) to 1
- Potentially faster processing for multiple questions
- More cost-effective for large question sets
- Automatic fallback to individual mode if batch processing fails
AutoC supports Retrieval-Augmented Generation (RAG) for intelligent context retrieval during Q&A processing:
- Standard mode (default): Uses the entire article content as context for answering questions
- RAG mode: Intelligently retrieves only the most relevant chunks of content for each question
To enable RAG mode, set the environment variable in the `.env` file:
```
QNA_RAG_MODE=true
```
You can also control this via API settings by including `"qna_rag_mode": true` in your request.
Benefits of RAG mode:
- More targeted and relevant answers by focusing on specific content sections
- Improved answer quality for long articles by reducing noise
- Better handling of multi-topic articles
- Automatic content chunking and semantic search
- Efficient processing of large documents
Note: RAG mode only works with the individual Q&A processing mode. When batch mode (`QNA_BATCH_MODE=true`) is enabled, RAG mode is automatically disabled and the full article content is used as context.
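For example, an `.env` that uses RAG mode must keep batch mode disabled:

```
QNA_BATCH_MODE=false
QNA_RAG_MODE=true
```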
RAG Configuration:
RAG mode requires a Milvus vector database. Configure the connection in your `.env` file:
```
RAG_MILVUS_HOST=localhost
RAG_MILVUS_PORT=19530
RAG_MILVUS_USER=
RAG_MILVUS_PASSWORD=
RAG_MILVUS_SECURE=false
```
To run Milvus with Docker:
```bash
docker-compose --profile milvus up -d
```
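To check that Milvus is reachable with the settings above, a quick connectivity test with the pymilvus client looks roughly like this (a troubleshooting sketch only; AutoC manages its own Milvus connections internally):

```python
from pymilvus import connections, utility

# Connect using the same values as the RAG_MILVUS_* settings in .env.
connections.connect(alias="default", host="localhost", port="19530")
print("Milvus is reachable; collections:", utility.list_collections())
connections.disconnect("default")
```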
How it works:
- Article content is automatically chunked and indexed into Milvus vector store
- For each analyst question, the most relevant content chunks are retrieved
- Only the relevant context is sent to the LLM for answer generation
- Vector store is automatically cleaned up after processing
AutoC can detect MITRE ATT&CK TTPs in the blog post content, which can be used to identify the techniques and tactics used by the threat actors.
To enable MITRE ATT&CK TTP detection, set the following environment variables in the `.env` file:
```
HF_TOKEN=<your_huggingface_token>
DETECT_MITRE_TTPS_MODEL_PATH=dvir056/mitre-ttp # Hugging Face model path for MITRE ATT&CK TTP detection
```
Information about model training: https://github.com/barvhaim/attack-ttps-detection?tab=readme-ov-file#-mitre-attck-ttps-classification
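As an illustration of how such a Hugging Face classifier can be called directly (a hedged sketch, not AutoC's internal code; the exact pipeline task and label format depend on how the model was trained, see the link above):

```python
from transformers import pipeline

# The "text-classification" task type is an assumption here; the token comes
# from HF_TOKEN and is only needed if the model requires authentication.
classifier = pipeline(
    "text-classification",
    model="dvir056/mitre-ttp",
    token="<your_huggingface_token>",
)
text = "The actor achieved persistence by creating a scheduled task."
print(classifier(text, top_k=3))  # candidate ATT&CK technique labels with scores
```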
Run the AutoC tool with the following command:
```bash
uv run python cli.py extract --help                 # see the available options
uv run python cli.py extract --url <blog_post_url>
```


Assuming the app's `.env` file is configured correctly, you can run the app using one of the following options.
To run the app locally, you'll need Node.js 20 and npm installed on your machine. We recommend using nvm to manage Node.js versions.
```bash
cd frontend
nvm use
npm install
npm run build
```
Once the build is complete, you can run the app using the following command from the root directory:
```bash
cd ..
uv run python -m uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4
```
Once the app is up and running, you can access it at http://localhost:8000.
For development purposes, you can run the app in development mode using the following commands.
Start the backend server:
```bash
uv run python -m uvicorn main:app --reload
```
Then, in a separate terminal, start the frontend development server:
```bash
cd frontend
nvm use
npm install
npm run build
npm run dev
```
Once the app is up and running, you can access it at http://localhost:5173

Make sure you have Claude Desktop, the `uv` package manager, and Python installed on your machine.
Clone the project repository and navigate to the project directory.
Install the required Python packages using `uv`:
```bash
uv sync
```
Edit the Claude Desktop config file and add the following lines to the `mcpServers` section:
```json
{
  "mcpServers": {
    "AutoC": {
      "command": "uv",
      "args": [
        "--directory",
        "/PATH/TO/AutoC",
        "run",
        "mcp_server.py"
      ]
    }
  }
}
```
Restart Claude Desktop, and you should see the AutoC MCP server in the list of available MCP servers.