Commit 72136d0

Merge pull request #264 from posit-dev/feat-mcp-server-improve
feat: Improve the MCP server
2 parents 0c73655 + 7cf8bd5 commit 72136d0

6 files changed, +1783 -75 lines changed

docs/_quarto.yml

Lines changed: 3 additions & 0 deletions
@@ -98,6 +98,9 @@ website:
         - user-guide/cli-data-inspection.qmd
         - user-guide/cli-data-validation.qmd
         - user-guide/cli-reference.qmd
+    - section: "MCP Server"
+      contents:
+        - user-guide/mcp-quick-start.qmd
 
   page-footer:
     left: 'Proudly supported by <a href="https://www.posit.co/" class="no-icon"><img src="/assets/posit-logo-black.svg" alt="Posit" width="80" style="padding-left: 3px;vertical-align:text-top;"></a>'
Lines changed: 248 additions & 0 deletions
@@ -0,0 +1,248 @@
+---
+title: "MCP Quick Start"
+jupyter: python3
+toc-expand: 2
+html-table-processing: none
+---
+
+## Getting Started in 5 Minutes
+
+Validate data through conversation with an AI assistant in VS Code or Positron. The three steps below get you started, and no complex configuration is required.
+
+### 1. Install
+
+```bash
+# Quotes keep shells such as zsh from interpreting the square brackets
+pip install "pointblank[mcp,pd,excel]"
+```
+
+**What this installs:**
+
+- `mcp` - Model Context Protocol server dependencies
+- `pd` - pandas backend for data processing
+- `excel` - Excel file support (`openpyxl`)
+
+**Alternative installs based on your needs:**
+
+```bash
+# Minimal: MCP server dependencies only
+pip install "pointblank[mcp]"
+
+# Add Polars for faster data processing
+pip install "pointblank[mcp,pd,pl]"
+
+# Full installation: pandas, Polars, and Excel support
+pip install "pointblank[mcp,pd,pl,excel]"
+```
+
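To confirm that the extras landed in the active environment, a quick import check can help. This is a minimal sketch, not part of Pointblank: `missing_extras` is a hypothetical helper, and the mapping from extra name to importable module is an assumption (`pandas` and `openpyxl` follow from the list above, while `mcp` and `polars` are guesses at the module names behind the `mcp` and `pl` extras).

```python
from importlib.util import find_spec

# Extra name -> module assumed to be importable once that extra is installed.
# These module names are assumptions for illustration.
ASSUMED_MODULES = {
    "mcp": "mcp",
    "pd": "pandas",
    "pl": "polars",
    "excel": "openpyxl",
}


def missing_extras(modules=ASSUMED_MODULES):
    """Return the extras whose assumed module cannot be found."""
    return sorted(
        extra for extra, module in modules.items() if find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_extras()
    if missing:
        print("Missing extras: " + ", ".join(missing))
    else:
        print("All assumed modules were found")
```

Run it with the same interpreter that the MCP configuration will launch, since a different `python` on the path may see a different set of packages.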
+### 2. Configure Your IDE
+
+**For VS Code**:
+
+**Option 1: Workspace Configuration (Recommended for teams)**
+
+1. Create a `.vscode/mcp.json` file in your project folder
+2. Add this configuration:
+
+```json
+{
+  "servers": {
+    "pointblank": {
+      "command": "python",
+      "args": ["-m", "pointblank_mcp_server.pointblank_server"]
+    }
+  }
+}
+```
+
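For teams that script their project setup, the workspace file can also be generated. The sketch below writes the same server entry to `.vscode/mcp.json` and keeps any other servers already listed there. `write_mcp_config` is a hypothetical helper, and it assumes an existing file is plain JSON without comments.

```python
import json
from pathlib import Path

# Same server entry as in the JSON configuration for Option 1.
POINTBLANK_SERVER = {
    "command": "python",
    "args": ["-m", "pointblank_mcp_server.pointblank_server"],
}


def write_mcp_config(project_dir):
    """Create or update <project_dir>/.vscode/mcp.json and return its path."""
    path = Path(project_dir) / ".vscode" / "mcp.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: an existing file parses as plain JSON (no comments).
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("servers", {})["pointblank"] = POINTBLANK_SERVER
    path.write_text(json.dumps(config, indent=2) + "\n")
    return path


if __name__ == "__main__":
    print(f"Wrote {write_mcp_config('.')}")
```

Merging into the existing `servers` mapping, rather than overwriting the file, avoids silently dropping entries a teammate added earlier.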
+**Option 2: User Configuration (Personal use)**
+
+1. Open the Command Palette (Cmd/Ctrl + Shift + P) and run `MCP: Open User Configuration`
+2. Add the same JSON configuration shown above
+
+> ⚠️ **Security Note**: Only add MCP servers from trusted sources. VS Code will ask you to confirm trust when starting the server for the first time.
+
+**For Positron**:
+
+1. Open Positron Settings
+2. Navigate to MCP Server configuration
+3. Add the configuration (format may vary)
+
+> **Note**: If you don't see MCP settings, you may need to install an MCP extension first. Search for "MCP" in the Extensions marketplace.
+
+### 3. Start Chatting
+
+```
+"Load my sales data and check its quality"
+```
+
+That's all it takes to get started.
+
+## Essential Commands
+
+These four command patterns cover most data validation scenarios. Think of them as your core vocabulary for talking to Pointblank.
+
+### Load Data
+
+```
+"Load the file /path/to/data.csv"
+"Open my customer data from Downloads"
+"Load the Excel file with sales metrics"
+```
+
+### Check Quality
+
+```
+"Analyze the data quality"
+"What issues should I worry about?"
+"Check for missing values and duplicates"
+```
+
+### Create Data Validations
+
+```
+"Set up validation rules for this data"
+"Check that all emails are valid"
+"Make sure amounts are positive"
+```
+
+### Run and Export
+
+```
+"Run the validation"
+"Show me what failed"
+"Export problem rows to CSV"
+```
+
+---
+
+## Top Three Use Cases
+
+Once Pointblank is set up, these three scenarios show the most common ways to use natural language data validation. Each example illustrates the conversational flow.
+
+### 1. 📊 Quick Data Check
+
+**When**: You get a new dataset
+
+```
+👤 "I just got customer_data.csv. Is it clean?"
+🤖 "Loading... Found 15K customers. Issues:
+- ✅ No duplicate IDs
+- ⚠️ 2% missing phones
+- ❌ 234 invalid emails
+Want me to fix the emails?"
+```
+
+### 2. 🔄 Daily Validation
+
+**When**: Regular data quality checks
+
+```
+👤 "Run our standard sales validation"
+🤖 "Running daily_sales template...
+✅ All checks passed!
+3,247 transactions validated
+Ready for dashboard update"
+```
+
+### 3. 🚨 Issue Investigation
+
+**When**: Something looks wrong
+
+```
+👤 "Conversion rates jumped to 95%. That's impossible"
+🤖 "Found the problem! Test data contamination:
+- 1,247 test emails with 100% conversion
+- Real rate is 4.2% (normal)
+Should I clean the data?"
+```
+
+These examples show how a short conversation can surface and resolve data quality issues that might otherwise take hours to diagnose manually.
+
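To make the "Quick Data Check" scenario concrete, here is a plain-Python sketch of the kind of summary behind an answer like "no duplicate IDs, 2% missing phones". It is an illustration only, not how the Pointblank MCP server computes its results. `quick_profile` and the field names are hypothetical.

```python
from collections import Counter


def quick_profile(rows, id_field="id"):
    """Summarize row count, duplicated IDs, and percent missing per field."""
    total = len(rows)
    id_counts = Counter(row.get(id_field) for row in rows)
    duplicate_ids = [value for value, count in id_counts.items() if count > 1]
    fields = sorted({field for row in rows for field in row})
    # A value counts as missing when it is None or an empty string.
    missing_pct = {
        field: round(100 * sum(row.get(field) in (None, "") for row in rows) / total, 1)
        for field in fields
    }
    return {"rows": total, "duplicate_ids": duplicate_ids, "missing_pct": missing_pct}


if __name__ == "__main__":
    sample = [
        {"id": 1, "phone": "555-0100"},
        {"id": 2, "phone": ""},
        {"id": 2, "phone": None},
        {"id": 3, "phone": "555-0101"},
    ]
    print(quick_profile(sample))
```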
+## Common Validation Rules
+
+Knowing which validation rules to ask for helps you build comprehensive data quality checks quickly. These examples cover frequent validation scenarios across industries and data types.
+
+### Data Integrity
+
+- "Check for duplicate IDs"
+- "Ensure no missing required fields"
+- "Validate that dates are reasonable"
+
+### Business Logic
+
+- "Amounts must be positive"
+- "Email addresses must be valid format"
+- "Status must be active, inactive, or pending"
+
+### Cross-Field Validation
+
+- "End date must be after start date"
+- "Discount percentage between 0 and 100"
+- "Age must match birth date"
+
+These rule patterns can be combined and customized for your data and business requirements. The natural language interface lets you express complex validation logic without learning technical syntax.
+
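It can help to see what several of these requests mean as explicit checks. The sketch below expresses five of the rules above as row-level predicates in plain Python. It is not Pointblank's implementation. The field names, the deliberately loose email pattern, and the `failing_rows` helper are all assumptions for illustration.

```python
import re
from datetime import date

# Loose format check: something@something.something, with no whitespace.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
ALLOWED_STATUS = {"active", "inactive", "pending"}

# Rule name -> predicate that returns True when a row passes.
RULES = {
    "amount is positive": lambda r: r["amount"] > 0,
    "email has valid format": lambda r: bool(EMAIL_RE.match(r["email"])),
    "status is allowed": lambda r: r["status"] in ALLOWED_STATUS,
    "end date after start date": lambda r: r["end"] > r["start"],
    "discount between 0 and 100": lambda r: 0 <= r["discount"] <= 100,
}


def failing_rows(rows, rules=RULES):
    """Map each rule name to the indexes of the rows that break it."""
    return {
        name: [i for i, row in enumerate(rows) if not check(row)]
        for name, check in rules.items()
    }


if __name__ == "__main__":
    rows = [
        {"amount": 10.0, "email": "a@example.com", "status": "active",
         "start": date(2024, 1, 1), "end": date(2024, 2, 1), "discount": 10},
        {"amount": -5.0, "email": "not-an-email", "status": "archived",
         "start": date(2024, 3, 1), "end": date(2024, 2, 1), "discount": 120},
    ]
    for rule, indexes in failing_rows(rows).items():
        print(f"{rule}: failing rows {indexes}")
```

Keeping each rule as a named predicate mirrors how the requests are phrased: one sentence, one check, one list of failing rows.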
+## Tips and Tricks
+
+These recommendations will help you get more value from the Pointblank MCP server and avoid common pitfalls.
+
+### Talk Naturally
+
+**Good:** "Check if customer emails look valid"
+
+**Avoid:** "Execute col_vals_regex on email column"
+
+### Provide Context
+
+**Good:** "This is for the board presentation"
+
+**Avoid:** Asking for validation without explaining its purpose
+
+### Build Incrementally
+
+1. Start with data profiling
+2. Add basic validation rules
+3. Create templates for reuse
+4. Set up automated checks
+
+### Save Templates
+
+```
+"Save these rules as 'customer_validation'"
+"Apply the financial_data template"
+"Use our standard survey validation"
+```
+
+These practices help you build data quality workflows that scale with your needs while remaining accessible to people with varying technical backgrounds.
+
+## File Support
+
+Pointblank works with several common data file formats, so you can validate data regardless of how it's stored and keep validation practices consistent across your data ecosystem.
+
+| Type | Extensions | Example |
+|------|------------|---------|
+| **CSV** | `.csv` | `sales_data.csv` |
+| **Excel** | `.xlsx`, `.xls` | `monthly_report.xlsx` |
+| **Parquet** | `.parquet` | `big_data.parquet` |
+| **JSON** | `.json`, `.jsonl` | `api_response.json` |
+
+The natural language interface works the same way for every file format, so you can focus on validation logic rather than technical details.
+
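The table above amounts to a lookup from file extension to file type. As a small sketch, a script can check a file name against it before asking the assistant to load the file. `file_type` is a hypothetical helper, and the mapping simply restates the table.

```python
from pathlib import Path

# Extension -> type, restating the File Support table.
SUPPORTED_TYPES = {
    ".csv": "CSV",
    ".xlsx": "Excel",
    ".xls": "Excel",
    ".parquet": "Parquet",
    ".json": "JSON",
    ".jsonl": "JSON",
}


def file_type(path):
    """Return the table's type name for a path, or None if it is not listed."""
    return SUPPORTED_TYPES.get(Path(path).suffix.lower())


if __name__ == "__main__":
    for name in ("sales_data.csv", "monthly_report.xlsx", "notes.txt"):
        print(name, "->", file_type(name))
```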
+## Quick Troubleshooting
+
+These quick fixes resolve the most common problems.
+
+| Problem | Quick Fix |
+|---------|-----------|
+| "File not found" | Use the full file path, e.g. `/Users/name/data.csv` |
+| Validation too slow | "Use a sample for testing" |
+| Don't understand error | "Explain why validation failed" |
+| Need help | "Show me examples of data quality checks" |
+
+You can always ask the AI to explain what's happening or to suggest solutions when you run into problems.
+
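For the "File not found" case, one way to get a full path to paste into the chat is to resolve it first. This is a minimal sketch using only the standard library. `full_path` is a hypothetical helper, not something the MCP server provides.

```python
from pathlib import Path


def full_path(path_text):
    """Expand '~', make the path absolute, and fail early if no file is there."""
    path = Path(path_text).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"No file at {path}")
    return str(path)


if __name__ == "__main__":
    try:
        print(full_path("~/Downloads/data.csv"))
    except FileNotFoundError as err:
        print(err)
```

Failing early here separates a wrong path from a problem inside the server, which narrows down where to look.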
+## Now You're Ready!
+
+You now have everything you need to start validating data through conversation. The Pointblank MCP server grows with your expertise: start with simple commands and build more sophisticated validation workflows as you become comfortable with the interface.
+
+The AI will guide you through the process and help you create robust data quality checks!
