AutoC

AutoC is an automated tool designed to extract and analyze Indicators of Compromise (IoCs) from open-source threat intelligence sources.

Features

  • Threat Intelligence Parsing: Parses blogs, reports, and feeds from various OSINT sources.
  • IoC Extraction: Automatically extracts IoCs such as IP addresses, domains, file hashes, and more.
  • Visualization: Display extracted IoCs and analysis in a user-friendly interface.
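The IoC extraction step can be illustrated with a minimal regex-based sketch. The patterns and function names here are illustrative only, not AutoC's actual pipeline (a real extractor also handles defanged indicators such as hxxp:// or 1.2.3[.]4):

```python
import re

# Illustrative patterns only; a production extractor is more thorough.
PATTERNS = {
    "ipv4": r"\b(?:\d{1,3}\.){3}\d{1,3}\b",
    "domain": r"\b[a-z0-9-]+(?:\.[a-z0-9-]+)+\b",
    "md5": r"\b[a-f0-9]{32}\b",
    "sha256": r"\b[a-f0-9]{64}\b",
}

def extract_iocs(text):
    """Return deduplicated, sorted matches for each indicator type."""
    return {
        name: sorted(set(re.findall(pattern, text, re.IGNORECASE)))
        for name, pattern in PATTERNS.items()
    }
```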

Getting Started

🚀 Quick Start

The fastest way to get started with AutoC is to run it with Docker (via docker-compose).

Make sure to set up the .env file with your API keys before running the app (See Configuration section below for more details).

git clone https://github.com/barvhaim/AutoC.git
cd AutoC
docker-compose up --build

Once the app is up and running, you can access it at http://localhost:8000

Optional Services

  • With crawl4ai: docker-compose --profile crawl4ai up --build
  • With Milvus vector database: docker-compose --profile milvus up --build
  • With both: docker-compose --profile crawl4ai --profile milvus up --build

📦 Installation

  1. Install Python 3.11 or later. (https://www.python.org/downloads/)
  2. Install uv package manager (https://docs.astral.sh/uv/getting-started/installation/)
    • For Linux and MacOS, you can use the following command:
      curl -LsSf https://astral.sh/uv/install.sh | sh
    • For Windows, you can use the following command:
      powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
  3. Clone the project repository and navigate to the project directory.
    git clone https://github.com/barvhaim/AutoC.git
    cd AutoC
  4. Install the required Python packages using uv.
    uv sync
  5. Configure the .env file with your API keys (See Configuration section below for more details).

🔑 Configuration

Set up API keys by adding them to the .env file (use the .env.example file as a template). AutoC supports multiple LLM providers (IBM watsonx, OpenAI, RITS, Ollama); you will configure which one to use in the next step.

cp .env.example .env

Supported LLM providers:

  • watsonx.ai by IBM ("watsonx")
  • OpenAI ("openai") - Experimental
  • RITS internal IBM ("rits")
  • Ollama ("ollama") - Experimental

Suggested models by provider:

Provider (LLM_PROVIDER) and suggested models (LLM_MODEL):

  • watsonx.ai by IBM (watsonx): meta-llama/llama-3-3-70b-instruct, ibm-granite/granite-3.1-8b-instruct
  • RITS (rits): meta-llama/llama-3-3-70b-instruct, ibm-granite/granite-3.1-8b-instruct, deepseek-ai/DeepSeek-V3
  • OpenAI (openai): gpt-4.1-nano
  • Ollama (ollama) - Experimental: granite3.2:8b
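As an example, a minimal .env fragment selecting the watsonx provider might look like the following (the API-key variable name below is illustrative; copy the real variable names from .env.example):

```shell
LLM_PROVIDER=watsonx
LLM_MODEL=meta-llama/llama-3-3-70b-instruct
# Illustrative name only -- use the key variable defined in .env.example
WATSONX_API_KEY=<your_api_key>
```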

Enhanced Blog post extraction (optional)

By default, AutoC uses a combination of the docling and beautifulsoup4 libraries to extract blog post content, fetching pages with the requests library behind the scenes.

Alternatively, Crawl4AI can fetch blog post content using a headless browser, which is more reliable but requires additional setup.

To enable Crawl4AI, you need a Crawl4AI backend server, which can be run using Docker:

docker-compose --profile crawl4ai up -d

The crawl4ai service uses a profile configuration, so it only starts when explicitly requested with the --profile crawl4ai flag.

Then set the environment variables in the .env file to point to the Crawl4AI server:

USE_CRAWL4AI_HEADLESS_BROWSER_HTML_PARSER=true
CRAWL4AI_BASE_URL=http://localhost:11235

Q&A Batch Mode (optional)

AutoC processes analyst questions about articles in two modes:

  • Individual mode (default): Each question is processed separately with individual LLM calls
  • Batch mode: All questions are processed together in a single LLM call for improved performance

To enable batch mode, set the environment variable in the .env file:

QNA_BATCH_MODE=true

You can also control this via API settings by including "qna_batch_mode": true in your request.

Benefits of batch mode:

  • Reduces number of API calls from N questions to 1 call
  • Potentially faster processing for multiple questions
  • More cost-effective for large question sets
  • Automatic fallback to individual mode if batch processing fails
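The two modes and the fallback behavior can be sketched as follows. This is a simplified illustration, not AutoC's actual implementation; parsing the combined batch response is reduced to a naive split:

```python
def answer_questions(questions, article, llm_call, batch_mode=True):
    """Sketch of QNA_BATCH_MODE: batch mode makes one LLM call for all
    questions and falls back to individual calls if the batch fails."""
    if batch_mode:
        numbered = "\n".join(f"Q{i + 1}: {q}" for i, q in enumerate(questions))
        try:
            # Single call covering every question; answer parsing is
            # reduced here to splitting on newlines.
            combined = llm_call(f"{article}\n\nAnswer each question:\n{numbered}")
            return combined.split("\n")
        except Exception:
            pass  # automatic fallback to individual mode
    # Individual mode: one LLM call per question.
    return [llm_call(f"{article}\n\nQ: {q}") for q in questions]
```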

Q&A RAG Mode (optional)

AutoC supports Retrieval-Augmented Generation (RAG) for intelligent context retrieval during Q&A processing:

  • Standard mode (default): Uses the entire article content as context for answering questions
  • RAG mode: Intelligently retrieves only the most relevant chunks of content for each question

To enable RAG mode, set the environment variable in the .env file:

QNA_RAG_MODE=true

You can also control this via API settings by including "qna_rag_mode": true in your request.

Benefits of RAG mode:

  • More targeted and relevant answers by focusing on specific content sections
  • Improved answer quality for long articles by reducing noise
  • Better handling of multi-topic articles
  • Automatic content chunking and semantic search
  • Efficient processing of large documents

Note: RAG mode only works with individual Q&A processing mode. When batch mode (QNA_BATCH_MODE=true) is enabled, RAG mode is automatically disabled and the full article content is used as context.

RAG Configuration: RAG mode requires a Milvus vector database. Configure the connection in your .env file:

RAG_MILVUS_HOST=localhost
RAG_MILVUS_PORT=19530
RAG_MILVUS_USER=
RAG_MILVUS_PASSWORD=
RAG_MILVUS_SECURE=false

To run Milvus with Docker:

docker-compose --profile milvus up -d

How it works:

  1. Article content is automatically chunked and indexed into Milvus vector store
  2. For each analyst question, the most relevant content chunks are retrieved
  3. Only the relevant context is sent to the LLM for answer generation
  4. Vector store is automatically cleaned up after processing
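Steps 1 and 2 above can be sketched with a toy bag-of-words similarity standing in for Milvus and real embeddings (illustrative only; the function names are not AutoC's):

```python
import math
from collections import Counter

def chunk(text, size=40):
    """Step 1: split article content into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Toy embedding: bag-of-words term frequencies."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, chunks, k=2):
    """Step 2: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```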

MITRE ATT&CK TTPs detection (optional)

AutoC can detect MITRE ATT&CK TTPs in blog post content, identifying the techniques and tactics used by threat actors. To enable MITRE ATT&CK TTPs detection, set the following environment variables in the .env file:

HF_TOKEN=<your_huggingface_token>
DETECT_MITRE_TTPS_MODEL_PATH=dvir056/mitre-ttp  # Hugging Face model path for MITRE ATT&CK TTPs detection

Information about model training: https://github.com/barvhaim/attack-ttps-detection?tab=readme-ov-file#-mitre-attck-ttps-classification

πŸ“ Usage

Run the AutoC tool with the following command:

uv run python cli.py extract --url <blog_post_url>

To see the available options:

uv run python cli.py extract --help

🧑‍💻 Bonus - Try our UI


πŸƒUp and running options:

Assuming the .env file is configured correctly, you can run the app using one of the following options:

Running the app

To run the app locally, you'll need Node 20 and npm installed on your machine. We recommend using nvm to manage Node versions.

cd frontend
nvm use
npm install
npm run build

Once the build is complete, you can run the app using the following command from the root directory:

cd ..
uv run python -m uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4

Once the app is up and running, you can access it at http://localhost:8000

Development

For development purposes, you can run the app in development mode using the following command:

Start the backend server:

uv run python -m uvicorn main:app --reload

and in a separate terminal, start the frontend development server:

cd frontend
nvm use
npm install
npm run build
npm run dev

Once the app is up and running, you can access it at http://localhost:5173

🔨 MCP tool for Claude Desktop (Experimental)


Make sure you have Claude Desktop, the uv package manager, and Python installed on your machine. Then clone the project repository and navigate to the project directory.

Install the required Python packages using uv.

uv sync

Edit the Claude Desktop config file and add the following lines to the mcpServers section:

{
  "mcpServers": {
    "AutoC": {
      "command": "uv",
      "args": [
        "--directory",
        "/PATH/TO/AutoC",
        "run",
        "mcp_server.py"
      ]
    }
  }
}

Restart Claude Desktop; you should see the AutoC MCP server in the list of available MCP servers.
