Build schema registry like Hasura #23

Open
veeramarni opened this issue Oct 3, 2024 · 3 comments

@veeramarni
Collaborator

To build an endpoint similar to Hasura, where you can dynamically manage GraphQL schemas, including adding or removing tables, fields, or remote schemas at runtime, you can create a custom server using Node.js, Express, and Apollo Server. Additionally, you will need to create endpoints to handle metadata changes and dynamically build or update your GraphQL schema.

Below is an example of a simple implementation that provides:

  1. A dynamic schema registry endpoint: For adding or removing schemas dynamically.
  2. A unified GraphQL endpoint: Serves as a single entry point for the dynamically registered schemas.

Step-by-Step Guide to Building a Hasura-like Dynamic GraphQL Endpoint

Step 1: Set Up the Server

First, let's initialize a basic Node.js server using Express and Apollo Server. We will also include graphql-tools to help with building and merging schemas dynamically.

  1. Install necessary packages:
npm install express apollo-server-express graphql @graphql-tools/schema
  2. Create the base server file (server.js):
const express = require('express');
const { ApolloServer, gql } = require('apollo-server-express');
const { mergeSchemas, makeExecutableSchema } = require('@graphql-tools/schema');

// In-memory schema storage
let schemas = [];

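// Base schema so the merged schema always has a Query type
const initialTypeDefs = gql`
  type Query {
    _empty: String
  }
`;

const initialResolvers = {
  Query: {
    _empty: () => '',
  },
};

const baseSchema = makeExecutableSchema({ typeDefs: initialTypeDefs, resolvers: initialResolvers });

// The schema currently being served, and the Apollo middleware built from it
let mergedSchema = baseSchema;
let apolloServer = null;
let graphqlMiddleware = null;

// An ApolloServer keeps the schema it was constructed with, so a schema change
// needs a new instance. Build and start it first, then swap it in.
async function rebuildServer(nextSchemas) {
  const nextMerged = mergeSchemas({ schemas: [baseSchema, ...nextSchemas] });
  const nextServer = new ApolloServer({
    schema: nextMerged,
    context: ({ req }) => ({
      // Add any context if needed
    }),
  });
  await nextServer.start();

  const previousServer = apolloServer;
  mergedSchema = nextMerged;
  apolloServer = nextServer;
  // app.use('/graphql', ...) strips the mount path, so the router matches '/'
  graphqlMiddleware = nextServer.getMiddleware({ path: '/' });
  if (previousServer) await previousServer.stop();
}

const app = express();

// Parse JSON request bodies
app.use(express.json());

app.post('/add-schema', async (req, res) => {
  // JSON cannot carry functions, so resolver functions cannot arrive in this body
  const { typeDefs, resolvers = {} } = req.body;

  try {
    // Parse the new schema, then rebuild the server with it included
    const newSchema = makeExecutableSchema({ typeDefs: gql(typeDefs), resolvers });
    await rebuildServer([...schemas, newSchema]);
    schemas.push(newSchema);

    // Respond with success message
    res.status(200).json({ message: 'Schema added successfully!' });
  } catch (error) {
    res.status(400).json({ message: 'Error adding schema', error: error.message });
  }
});

// Endpoint to remove a schema by index (for simplicity)
app.post('/remove-schema', async (req, res) => {
  const { index } = req.body;
  if (!Number.isInteger(index) || index < 0 || index >= schemas.length) {
    return res.status(400).json({ message: 'Invalid schema index' });
  }
  try {
    await rebuildServer(schemas.filter((_, i) => i !== index));
    schemas.splice(index, 1);
    res.status(200).json({ message: 'Schema removed successfully!' });
  } catch (error) {
    res.status(500).json({ message: 'Error removing schema', error: error.message });
  }
});

// Delegate each request to whichever Apollo middleware is current
app.use('/graphql', (req, res, next) => graphqlMiddleware(req, res, next));

// Build the initial server, then start Express
rebuildServer(schemas).then(() => {
  app.listen(4000, () => {
    console.log('Server running on http://localhost:4000/graphql');
  });
});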

Explanation:

  1. Schema Management:

    • An in-memory storage (schemas) is used to keep track of all the dynamically added schemas.
    • mergeSchemas from @graphql-tools/schema is used to combine multiple schemas into one unified schema that the Apollo Server uses.
  2. Dynamic Schema Registration:

    • The POST /add-schema endpoint allows new schemas to be added dynamically by sending a request with the new schema's type definitions (typeDefs) and resolvers (resolvers).
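    • The server merges the new schema with the existing ones into mergedSchema. An ApolloServer instance keeps the schema it was constructed with, so the change only takes effect once a new instance is built from the updated mergedSchema.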
  3. Dynamic Schema Removal:

    • The POST /remove-schema endpoint allows you to remove a schema by specifying its index in the schemas array. This is for demonstration purposes and can be replaced with a more sophisticated schema management strategy.
  4. Apollo Server:

    • The Apollo Server is created using the merged schema and listens on /graphql for GraphQL queries.
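
Removing schemas by array index is fragile, because every later index shifts after a removal. One possible replacement is a name-keyed registry. The sketch below is a minimal illustration under that assumption: the SchemaRegistry class and its method names are hypothetical and not part of any library, and the stored values are treated as opaque (in the server above they would be the executable schemas).

```javascript
// Hypothetical name-keyed registry; values are opaque (e.g. executable schemas).
class SchemaRegistry {
  constructor() {
    this.entries = new Map(); // name -> schema, kept in insertion order
  }

  // Add or replace the schema stored under `name`.
  set(name, schema) {
    this.entries.set(name, schema);
  }

  // Remove by name; returns false if nothing was registered under it.
  remove(name) {
    return this.entries.delete(name);
  }

  // Snapshot of all schemas, ready to hand to a merge step.
  list() {
    return [...this.entries.values()];
  }
}

const registry = new SchemaRegistry();
registry.set('books', { id: 'books-schema' });
registry.set('authors', { id: 'authors-schema' });
registry.remove('books');
console.log(registry.list().map((s) => s.id)); // [ 'authors-schema' ]
```

With a registry like this, a removal request would carry a schema name instead of an index, and the result of list() would be passed to the merge step.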

Step 2: Dynamic Schema Management

  1. Add a New Schema

Use the following curl command to add a new schema to the server. JSON cannot carry JavaScript functions, so resolvers is sent as an empty object:

curl -X POST http://localhost:4000/add-schema \
  -H "Content-Type: application/json" \
  -d '{
    "typeDefs": "type Book { title: String author: String } type Query { books: [Book] }",
    "resolvers": {}
  }'

This command adds a new schema with a Book type and a books query. With no resolver attached, books resolves to null until a resolver function is registered on the server side.

  2. Query the New Schema

Once added, you can query the newly added schema by using the GraphQL playground or sending a query directly:

{
  books {
    title
    author
  }
}

Step 3: Enhancements and Considerations

  1. Persistent Storage: In this example, schemas are stored in memory, which means they are lost if the server restarts. Consider using a database or a configuration service to persist the schemas.
  2. Schema Versioning: Implement versioning for schemas to keep track of changes and provide rollback capabilities.
  3. Security: Ensure that only authorized users can add or remove schemas. Implement authentication and authorization mechanisms.
  4. Subscription Updates: If using subscriptions, ensure that schema updates propagate to all connected clients.
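
The versioning and rollback idea can be sketched without any GraphQL dependency. This is a minimal illustration that assumes each version is just the type-definition string; the SchemaHistory class and its methods are hypothetical names, and a real setup would persist the history instead of keeping it in memory.

```javascript
// Hypothetical version history for one schema's type definitions (SDL strings).
class SchemaHistory {
  constructor() {
    this.versions = []; // oldest first
  }

  // Record a new version and return its 1-based version number.
  publish(typeDefs) {
    this.versions.push({ typeDefs, publishedAt: new Date().toISOString() });
    return this.versions.length;
  }

  // The version currently being served, or undefined if none was published.
  current() {
    return this.versions[this.versions.length - 1];
  }

  // Drop the latest version and return the one that is now current.
  rollback() {
    if (this.versions.length < 2) {
      throw new Error('No earlier version to roll back to');
    }
    this.versions.pop();
    return this.current();
  }
}

const bookHistory = new SchemaHistory();
bookHistory.publish('type Query { books: [String] }');
bookHistory.publish('type Query { books: [String] authors: [String] }');
const restored = bookHistory.rollback();
console.log(restored.typeDefs); // type Query { books: [String] }
```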

Final Thoughts

The above implementation gives you a basic structure for dynamically registering and managing GraphQL schemas, similar to Hasura's functionality. For production, consider adding more sophisticated metadata management, schema validation, and caching strategies to enhance the server's robustness and scalability.

@veeramarni
Collaborator Author

If each tenant has its own custom schema, then the system needs to handle dynamic schema registration, storage, and retrieval per tenant. The challenge here is to isolate each tenant's schema while allowing efficient switching between schemas without restarting the server.

In a multi-tenant setup like this, the typical approach involves:

  1. Dynamic Schema Registration: Allow tenants to register or update their own schemas.
  2. Tenant Identification: Identify the tenant making the request using headers, URL paths, or tokens.
  3. Schema Storage and Retrieval: Store tenant-specific schemas in memory, a database, or another external storage.
  4. Schema Switching: Use middleware to select the appropriate schema based on tenant identification and serve it dynamically.
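
As a rough illustration of the tenant identification step, the function below checks a header first and then falls back to a URL path prefix. The x-tenant-id header matches the one used later in this thread; the /t/<tenant>/ path layout and the function name are assumptions made only for this sketch. Neither source proves identity on its own, so a real deployment would still authenticate the caller.

```javascript
// Hypothetical resolver: header first, then a /t/<tenant>/... path prefix.
// The path layout is an assumption for illustration.
function resolveTenantId(req) {
  const fromHeader = req.headers['x-tenant-id'];
  if (typeof fromHeader === 'string' && fromHeader.trim() !== '') {
    return fromHeader.trim();
  }
  const match = /^\/t\/([A-Za-z0-9_-]+)(\/|$)/.exec(req.path || '');
  return match ? match[1] : null;
}

console.log(resolveTenantId({ headers: { 'x-tenant-id': 'tenant1' }, path: '/graphql' })); // tenant1
console.log(resolveTenantId({ headers: {}, path: '/t/tenant2/graphql' })); // tenant2
console.log(resolveTenantId({ headers: {}, path: '/graphql' })); // null
```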

Let's build a complete solution using Express and Apollo Server that meets the following requirements:

  • Each tenant has their own isolated GraphQL schema.
  • Tenants can register or update their schemas dynamically.
  • The server can dynamically switch schemas based on the tenant making the request.
  • The server runs on a single port with shared endpoints, distinguished by tenant-specific headers or paths.

Implementation Strategy

We will build an Express server with the following features:

  1. In-Memory Schema Management: Store tenant schemas in an in-memory object.
  2. Dynamic Schema Registration Endpoint: Provide an endpoint for tenants to register or update their schemas.
  3. Tenant-Specific Query Endpoint: Create a single /graphql endpoint that can serve different schemas based on the tenant ID provided in the headers.
  4. Middleware for Schema Switching: Implement middleware to switch schemas dynamically based on the tenant ID.

Implementation

Step 1: Install Necessary Packages

npm install express apollo-server-express graphql @graphql-tools/schema

Step 2: Create the Server File (server.js)

const express = require('express');
const { ApolloServer, gql } = require('apollo-server-express');
const { makeExecutableSchema } = require('@graphql-tools/schema');

// In-memory storage for tenant schemas
const tenantSchemas = {};

// Initialize Express app
const app = express();
app.use(express.json());

// Middleware to extract tenant ID from headers
const tenantMiddleware = (req, res, next) => {
  const tenantId = req.headers['x-tenant-id'];
  if (!tenantId) {
    return res.status(400).json({ error: 'Missing x-tenant-id header' });
  }
  req.tenantId = tenantId;
  next();
};

// Register new or update tenant schema
app.post('/register-schema', tenantMiddleware, (req, res) => {
  const { tenantId } = req;
  const { typeDefs, resolvers } = req.body;

  if (!typeDefs || !resolvers) {
    return res.status(400).json({ error: 'typeDefs and resolvers are required' });
  }

  try {
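    // Convert the stringified resolvers into an object of functions.
    // WARNING: eval executes tenant-supplied code with the server's full
    // privileges. Use it only in a local demo, never on an exposed endpoint.
    const resolversObject = eval(`(${resolvers})`);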
    
    // Create a new schema for the tenant
    const schema = makeExecutableSchema({
      typeDefs: gql(typeDefs),
      resolvers: resolversObject,
    });

    // Store the schema in memory with the tenantId as key
    tenantSchemas[tenantId] = schema;
    return res.status(200).json({ message: `Schema registered/updated successfully for tenant ${tenantId}` });
  } catch (error) {
    console.error(error);
    return res.status(500).json({ error: 'Failed to register schema: ' + error.message });
  }
});

// Tenant-specific GraphQL endpoint
app.use('/graphql', tenantMiddleware, async (req, res, next) => {
  const { tenantId } = req;

  // Check if schema exists for the tenant
  const schema = tenantSchemas[tenantId];
  if (!schema) {
    return res.status(404).json({ error: `No schema found for tenant ID ${tenantId}` });
  }

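  // Create a new Apollo Server instance for the tenant's schema
  const server = new ApolloServer({
    schema,
    context: () => ({ tenantId }), // Pass tenantId to context if needed in resolvers
  });

  try {
    await server.start();
    // apollo-server-express exposes getMiddleware(), not createHandler().
    // app.use('/graphql', ...) strips the mount path, so the router matches '/'.
    server.getMiddleware({ path: '/' })(req, res, next);
  } catch (error) {
    next(error);
  }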
});

// Start the Express server on a single port
app.listen(4000, () => {
  console.log(`Management endpoint: http://localhost:4000/register-schema`);
  console.log(`Tenant GraphQL endpoint: http://localhost:4000/graphql`);
});

Explanation

  1. Tenant Identification Middleware (tenantMiddleware):

    • This middleware extracts the tenant ID from the x-tenant-id header.
    • The tenant ID is used to look up the correct schema in the tenantSchemas object.
  2. Schema Registration Endpoint (/register-schema):

    • Tenants use this endpoint to register or update their own schema.
    • The schema is stored in the tenantSchemas object, using the tenantId as the key.
    • This endpoint expects typeDefs (schema definition) and resolvers (stringified resolver object) as part of the request body.
  3. Tenant-Specific GraphQL Endpoint (/graphql):

    • The endpoint serves GraphQL queries using the schema corresponding to the tenant ID.
    • If the tenant ID doesn’t have a registered schema, it returns a 404 error.
    • The ApolloServer instance is created dynamically for each request based on the tenant's schema.
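
Because the /register-schema endpoint evaluates the resolvers string with eval, any tenant can run arbitrary code on the server. One possible alternative is a declarative, JSON-safe resolver description that the server maps onto a fixed set of built-in behaviors. The sketch below assumes two hypothetical kinds, static and arg; the spec format and all names are illustrative and not part of graphql-tools or Apollo Server.

```javascript
// Hypothetical built-in resolver kinds; tenants pick a kind instead of
// shipping JavaScript source to the server.
const resolverKinds = new Map([
  // Always return the literal value from the spec.
  ['static', (spec) => () => spec.value],
  // Return the named argument passed to the field.
  ['arg', (spec) => (_parent, args) => args[spec.name]],
]);

// Turn { Type: { field: { kind, ... } } } into { Type: { field: function } }.
function buildResolvers(spec) {
  const built = {};
  for (const [typeName, fields] of Object.entries(spec)) {
    built[typeName] = {};
    for (const [fieldName, fieldSpec] of Object.entries(fields)) {
      const factory = resolverKinds.get(fieldSpec.kind);
      if (!factory) {
        throw new Error(`Unknown resolver kind for ${typeName}.${fieldName}`);
      }
      built[typeName][fieldName] = factory(fieldSpec);
    }
  }
  return built;
}

const bookResolvers = buildResolvers(JSON.parse(
  '{"Query":{"books":{"kind":"static","value":[{"title":"Harry Potter","author":"J.K. Rowling"}]}}}'
));
console.log(bookResolvers.Query.books()[0].title); // Harry Potter
```

The object returned by buildResolvers has the same shape that makeExecutableSchema accepts for its resolvers option, so it could stand in for the eval result.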

How to Use the Server

  1. Register a Schema for a Tenant

Use the following curl command to register or update a schema for a tenant:

curl -X POST http://localhost:4000/register-schema \
  -H "Content-Type: application/json" \
  -H "x-tenant-id: tenant1" \
  -d '{
    "typeDefs": "type Book { title: String author: String } type Query { books: [Book] }",
    "resolvers": "{ Query: { books: () => [{ title: \"Harry Potter\", author: \"J.K. Rowling\" }] } }"
  }'

This command registers a new schema for tenant1 with a Book type and a books query.

  2. Query the Tenant Schema

Once the schema is registered, the tenant can query their own schema by sending requests to the /graphql endpoint with their x-tenant-id header:

curl -X POST http://localhost:4000/graphql \
  -H "Content-Type: application/json" \
  -H "x-tenant-id: tenant1" \
  -d '{
    "query": "{ books { title author } }"
  }'

This query will return the following response:

{
  "data": {
    "books": [
      { "title": "Harry Potter", "author": "J.K. Rowling" }
    ]
  }
}

Enhancements and Considerations

  1. Persistent Storage:

    • If you want tenant schemas to persist across server restarts, consider storing them in a database (e.g., MongoDB or PostgreSQL) and load them into memory during server startup.
  2. Handling Large Numbers of Tenants:

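    • For scalability, keep the schema source (typeDefs) in a shared store such as a database or Redis, and cache the compiled schema objects inside each server process. An executable schema contains resolver functions, so it cannot be serialized into Redis directly.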
  3. Security:

    • Implement proper authentication and authorization to ensure that only authorized users can register or update schemas.
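    • Validate typeDefs before building a schema from them. The resolvers string is passed to eval, so it runs as arbitrary code on the server; input validation cannot make that safe. Accept resolvers only from trusted operators, or replace them with a declarative format.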
  4. Performance Optimization:

    • Creating a new ApolloServer instance for each request can be costly. To optimize, consider caching ApolloServer instances for each tenant schema or using schema stitching if applicable.
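
The caching suggestion can be sketched independently of Apollo Server. The class below memoizes one expensive build per tenant and rebuilds only after invalidation; in the server above, the build function would construct and start an ApolloServer and return its middleware. TenantHandlerCache and the demo names are hypothetical, introduced only for this illustration.

```javascript
// Hypothetical per-tenant cache: build the expensive handler once per tenant
// and rebuild only after that tenant registers a new schema.
class TenantHandlerCache {
  constructor(buildHandler) {
    this.buildHandler = buildHandler; // (tenantId) => handler or Promise<handler>
    this.handlers = new Map();        // tenantId -> Promise<handler>
  }

  // Return the cached handler, building it on first use. Caching the promise
  // means concurrent first requests share a single build.
  get(tenantId) {
    if (!this.handlers.has(tenantId)) {
      const pending = Promise.resolve()
        .then(() => this.buildHandler(tenantId))
        .catch((error) => {
          // Do not keep a failed build in the cache.
          if (this.handlers.get(tenantId) === pending) {
            this.handlers.delete(tenantId);
          }
          throw error;
        });
      this.handlers.set(tenantId, pending);
    }
    return this.handlers.get(tenantId);
  }

  // Call after a tenant registers or updates its schema.
  invalidate(tenantId) {
    this.handlers.delete(tenantId);
  }
}

let builds = 0;
const handlerCache = new TenantHandlerCache(async (tenantId) => {
  builds += 1;
  return () => `handled for ${tenantId}`;
});

async function demo() {
  const [a, b] = await Promise.all([handlerCache.get('tenant1'), handlerCache.get('tenant1')]);
  handlerCache.invalidate('tenant1');
  await handlerCache.get('tenant1');
  return { sameHandler: a === b, builds };
}

const demoResult = demo();
demoResult.then((result) => console.log(result)); // { sameHandler: true, builds: 2 }
```

The registration endpoint would call invalidate(tenantId) after storing a new schema, so the next request for that tenant triggers a fresh build.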

Final Thoughts

This implementation provides a basic multi-tenant GraphQL server that allows each tenant to dynamically register their own schema and query it in an isolated manner. You can expand this further by adding persistent storage, enhanced security, and more advanced features such as schema versioning and tenant-specific optimizations.

@veeramarni
Collaborator Author

Hasura does not create separate ports for each tenant. Instead, Hasura manages multi-tenancy on a single port using metadata configurations and role-based access control (RBAC) for schema isolation and access control.

How Hasura Handles Multi-Tenancy

  1. Single GraphQL Endpoint:

    • Hasura serves all tenants through a single GraphQL endpoint, usually at /v1/graphql.
    • The endpoint remains the same for all tenants, and there is no need for separate ports or URLs.
  2. Role-Based Access Control (RBAC):

    • Hasura uses roles to define permissions on tables and fields. Each tenant (or user) can have different roles, and the permissions are enforced based on these roles.
    • You can define roles such as tenant1_user, tenant2_admin, or any other custom role to manage access to specific tables, fields, or operations (queries, mutations, subscriptions).
  3. Row-Level Security:

    • Hasura provides row-level permissions to isolate data. This is usually achieved using session variables (e.g., X-Hasura-User-Id, X-Hasura-Role).
    • For example, you can define a rule that only allows a tenant to access rows in a table where the tenant_id column matches their X-Hasura-Tenant-Id session variable.
  4. Remote Schemas and Custom Permissions:

    • Hasura supports adding remote schemas, which can be different GraphQL services stitched together into a single Hasura endpoint. This way, different tenants can have different schemas integrated under the same endpoint.
    • Custom permissions and session variables help enforce schema isolation and access control based on tenants.
  5. Multi-tenant Architecture Options:

    • Single Database, Single Schema: Use row-level security to isolate tenant data in the same tables.
    • Single Database, Multiple Schemas: Create separate Postgres schemas for each tenant and configure permissions accordingly.
    • Multiple Databases: Use multiple databases (one for each tenant) if more isolation is needed, and configure each database separately in Hasura.

Managing Tenants with Hasura

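  • Session Variables: Hasura uses session variables to identify and authorize tenant access. In production these normally come from JWT claims or an authentication webhook; Hasura trusts X-Hasura-* headers sent directly by a client only on admin requests (for example, when the admin secret is supplied), so the plain headers below suit local testing: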

    curl -X POST http://localhost:8080/v1/graphql \
      -H "Content-Type: application/json" \
      -H "X-Hasura-Role: tenant1_user" \
      -H "X-Hasura-Tenant-Id: tenant1" \
      -d '{"query": "{ users { id name } }"}'
  • Access Control Configuration:

    • Define access control rules in the Hasura Console under the Permissions tab for each table and role.
    • Use conditions like { tenant_id: { _eq: "X-Hasura-Tenant-Id" } } to restrict access to rows based on the tenant.

How to Implement Multi-Tenancy in Hasura

If you're considering using Hasura for a multi-tenant setup, you can implement it in the following ways:

  1. Row-Level Permissions:

    • Create a tenant_id column in all your tables to identify which rows belong to which tenant.
    • Define permissions in Hasura to restrict access to rows based on the tenant_id column and the session variable X-Hasura-Tenant-Id.

    Example permission rule:

    {
      "tenant_id": {
        "_eq": "X-Hasura-Tenant-Id"
      }
    }
  2. Custom Roles for Tenants:

    • Create custom roles like tenant1_user and tenant2_user.
    • Define permissions for these roles to control which operations and tables each role can access.
  3. Separate Postgres Schemas for Each Tenant:

    • You can create separate Postgres schemas for each tenant if you want stronger isolation.
    • Configure Hasura to use these schemas and set permissions accordingly.
  4. Remote Schemas:

    • Integrate remote schemas specific to each tenant.
    • Use session variables and permissions to control access to the remote schemas.

Example: Row-Level Security in Hasura

  1. Create a Table with a tenant_id Column:

    CREATE TABLE orders (
      id serial PRIMARY KEY,
      item_name text,
      quantity integer,
      tenant_id text
    );
  2. Define Permissions in Hasura:

    Go to the Permissions tab in Hasura Console and set a permission rule like:

    {
      "tenant_id": {
        "_eq": "X-Hasura-Tenant-Id"
      }
    }
  3. Make a GraphQL Query with Tenant-specific Headers:

    curl -X POST http://localhost:8080/v1/graphql \
      -H "Content-Type: application/json" \
      -H "X-Hasura-Role: tenant1_user" \
      -H "X-Hasura-Tenant-Id: tenant1" \
      -d '{"query": "{ orders { id item_name quantity } }"}'

    This query will only return orders where tenant_id is tenant1.
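
The effect of the permission rule can be illustrated in plain JavaScript. This is an illustration only: Hasura applies permissions when it queries the database, and the evaluator below is not its implementation. The function name and the sample rows are assumptions made for the sketch.

```javascript
// Illustration only: a tiny evaluator for an `_eq` permission rule.
function rowAllowed(row, rule, session) {
  return Object.entries(rule).every(([column, condition]) => {
    const operand = condition._eq;
    // A string starting with "x-hasura-" refers to a session variable.
    const isSessionVar = typeof operand === 'string' && /^x-hasura-/i.test(operand);
    const expected = isSessionVar ? session[operand.toLowerCase()] : operand;
    return row[column] === expected;
  });
}

const rule = { tenant_id: { _eq: 'X-Hasura-Tenant-Id' } };
const session = { 'x-hasura-role': 'tenant1_user', 'x-hasura-tenant-id': 'tenant1' };
const orders = [
  { id: 1, item_name: 'pen', quantity: 2, tenant_id: 'tenant1' },
  { id: 2, item_name: 'ink', quantity: 1, tenant_id: 'tenant2' },
];

const visible = orders.filter((row) => rowAllowed(row, rule, session));
console.log(visible.map((row) => row.id)); // [ 1 ]
```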

Final Thoughts

Hasura does not require separate ports for each tenant because it relies on shared metadata configuration and role-based permissions behind a single endpoint to ensure data isolation and secure access. This simplifies management and deployment while providing flexibility to define complex multi-tenant access control mechanisms.

If you want more advanced multi-tenancy strategies, such as using a separate database for each tenant or managing different sets of remote schemas, you can configure Hasura accordingly, but the basic endpoint remains the same.

@veeramarni
Collaborator Author

To implement a Hasura-like multi-tenant solution using Node.js, Express, and Apollo Server, we can create a basic server that manages tenants dynamically using a single endpoint and provides tenant-specific schemas and permissions.

Overview of the Implementation

This solution involves:

  1. Tenant Identification: Using headers to identify the tenant.
  2. Tenant Authentication and Authorization: Using an API key or session variable to authenticate and authorize tenants.
  3. Tenant-Specific Schema Management: Allowing tenants to register and manage their own schemas.
  4. Schema Isolation: Serving the schema based on the tenant's context, ensuring data isolation.

Full Code Implementation

Let's create a server that implements these features using Node.js, Express, and Apollo Server.

Step 1: Set Up the Project

  1. Create a new directory and initialize a Node.js project:

    mkdir multi-tenant-graphql
    cd multi-tenant-graphql
    npm init -y
  2. Install the required packages:

    npm install express apollo-server-express graphql @graphql-tools/schema uuid

Step 2: Create the Server File (server.js)

Create a file named server.js in the project directory and add the following code:

const express = require('express');
const { ApolloServer, gql } = require('apollo-server-express');
const { makeExecutableSchema } = require('@graphql-tools/schema');
const { v4: uuidv4 } = require('uuid');

// In-memory storage for tenant schemas and API keys
const tenantSchemas = {};
const tenantAPIKeys = {};

// Generate and store API keys for demo purposes
tenantAPIKeys['tenant1'] = uuidv4(); // Generate a random API key for tenant1
tenantAPIKeys['tenant2'] = uuidv4(); // Generate a random API key for tenant2

console.log('API Keys:');
console.log(tenantAPIKeys); // Log the API keys for testing

// Express application instance
const app = express();
app.use(express.json());

// Authentication Middleware: Validate API key and set tenantId
const authenticateTenant = (req, res, next) => {
  const apiKey = req.headers['x-api-key'];

  // Check if API key is provided
  if (!apiKey) {
    return res.status(401).json({ error: 'Missing API key' });
  }

  // Find tenant ID based on the API key
  const tenantId = Object.keys(tenantAPIKeys).find(
    (key) => tenantAPIKeys[key] === apiKey
  );

  if (!tenantId) {
    return res.status(403).json({ error: 'Invalid API key' });
  }

  // Set tenantId in the request context
  req.tenantId = tenantId;
  next();
};

// Authorization Middleware: Ensure tenant can only access their own schema
const authorizeTenant = (req, res, next) => {
  const { tenantId } = req;

  // Check if the tenant has a registered schema
  if (!tenantSchemas[tenantId]) {
    return res.status(404).json({ error: `No schema found for tenant ID ${tenantId}` });
  }

  next();
};

// Schema Registration Endpoint: Allows tenants to register their own schema
app.post('/register-schema', authenticateTenant, (req, res) => {
  const { tenantId } = req;
  const { typeDefs, resolvers } = req.body;

  if (!typeDefs || !resolvers) {
    return res.status(400).json({ error: 'typeDefs and resolvers are required' });
  }

  try {
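    // Convert the stringified resolvers into an object of functions.
    // WARNING: eval executes tenant-supplied code with the server's full
    // privileges. Use it only in a local demo, never on an exposed endpoint.
    const resolversObject = eval(`(${resolvers})`);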

    // Create a new schema for the tenant
    const schema = makeExecutableSchema({
      typeDefs: gql(typeDefs),
      resolvers: resolversObject,
    });

    // Store the schema in memory with the tenantId as key
    tenantSchemas[tenantId] = schema;
    return res.status(200).json({ message: `Schema registered/updated successfully for tenant ${tenantId}` });
  } catch (error) {
    console.error(error);
    return res.status(500).json({ error: 'Failed to register schema: ' + error.message });
  }
});

// Tenant-specific GraphQL endpoint: Serves queries based on the tenant's schema
app.use('/graphql', authenticateTenant, authorizeTenant, async (req, res, next) => {
  const { tenantId } = req;

  // Get the schema corresponding to the tenantId
  const schema = tenantSchemas[tenantId];

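  // Create a new ApolloServer instance for the tenant's schema
  const server = new ApolloServer({
    schema,
    context: () => ({ tenantId }), // Pass tenantId to context if needed in resolvers
  });

  try {
    await server.start();
    // apollo-server-express exposes getMiddleware(), not createHandler().
    // app.use('/graphql', ...) strips the mount path, so the router matches '/'.
    server.getMiddleware({ path: '/' })(req, res, next);
  } catch (error) {
    next(error);
  }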
});

// Start the Express server on a single port
app.listen(4000, () => {
  console.log(`Schema Registration endpoint: http://localhost:4000/register-schema`);
  console.log(`Tenant GraphQL endpoint: http://localhost:4000/graphql`);
  console.log(`Use the tenant-specific API key to access their schema.`);
});

Explanation of the Implementation

  1. Tenant Authentication Middleware (authenticateTenant):

    • This middleware authenticates tenants using an API key provided in the x-api-key header.
    • If the API key is valid, it sets the corresponding tenantId in the request context.
  2. Tenant Authorization Middleware (authorizeTenant):

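    • This middleware checks that the authenticated tenant has a registered schema. Isolation comes from deriving tenantId from the API key, so a tenant can only ever reach its own schema.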
    • If no schema is found for the tenant, it returns a 404 error.
  3. Schema Registration Endpoint (/register-schema):

    • Allows tenants to register or update their schemas dynamically.
    • This endpoint accepts typeDefs (schema type definitions) and resolvers as a stringified object.
  4. Tenant-Specific GraphQL Endpoint (/graphql):

    • Serves tenant-specific GraphQL queries using the registered schema for the tenant.
    • The ApolloServer instance is created dynamically based on the tenant’s schema.
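
The authentication middleware scans every stored key on each request and keeps the keys in plaintext. A possible refinement, sketched here with Node's built-in crypto module, stores only a SHA-256 digest of each key and indexes tenants by that digest. The ApiKeyStore class and its method names are hypothetical, and the in-memory Map stands in for whatever storage a real deployment would use.

```javascript
const nodeCrypto = require('crypto');

// Hypothetical key store: keeps only SHA-256 digests, indexed for direct lookup.
class ApiKeyStore {
  constructor() {
    this.tenantByDigest = new Map(); // hex digest -> tenantId
  }

  static digest(apiKey) {
    return nodeCrypto.createHash('sha256').update(apiKey).digest('hex');
  }

  // Generate a key for the tenant; the plaintext is returned once, not stored.
  issue(tenantId) {
    const apiKey = nodeCrypto.randomBytes(32).toString('hex');
    this.tenantByDigest.set(ApiKeyStore.digest(apiKey), tenantId);
    return apiKey;
  }

  // Return the tenant that owns the key, or null if the key is unknown.
  lookup(apiKey) {
    if (typeof apiKey !== 'string') return null;
    return this.tenantByDigest.get(ApiKeyStore.digest(apiKey)) ?? null;
  }
}

const keyStore = new ApiKeyStore();
const tenant1Key = keyStore.issue('tenant1');
console.log(keyStore.lookup(tenant1Key)); // tenant1
console.log(keyStore.lookup('bad-key'));  // null
```

Inside authenticateTenant, the Object.keys(...).find(...) scan would then become a single lookup call on the x-api-key header value.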

How to Use the Server

  1. Start the Server:

    Run the server using the command:

    node server.js

    The server will start on port 4000 and log the API keys for each tenant.

  2. Register a Schema for a Tenant:

    Use the following curl command to register a schema for a tenant (replace YOUR_API_KEY with the logged API key for tenant1 or tenant2):

    curl -X POST http://localhost:4000/register-schema \
      -H "Content-Type: application/json" \
      -H "x-api-key: YOUR_API_KEY" \
      -d '{
        "typeDefs": "type Book { title: String author: String } type Query { books: [Book] }",
        "resolvers": "{ Query: { books: () => [{ title: \"Harry Potter\", author: \"J.K. Rowling\" }] } }"
      }'

    This command registers a new schema for the tenant with the given API key.

  3. Query the Tenant Schema:

    Once the schema is registered, you can query the tenant’s schema using the same API key in the x-api-key header:

    curl -X POST http://localhost:4000/graphql \
      -H "Content-Type: application/json" \
      -H "x-api-key: YOUR_API_KEY" \
      -d '{
        "query": "{ books { title author } }"
      }'

    This query will return:

    {
      "data": {
        "books": [
          { "title": "Harry Potter", "author": "J.K. Rowling" }
        ]
      }
    }

Security Considerations

  1. API Key Management:

    • For production, store API keys in a secure database and do not log them to the console.
    • Use environment variables or secret management tools to handle sensitive information.
  2. Schema Validation:

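    • Build each submitted schema with makeExecutableSchema inside a try/catch so that malformed type definitions are rejected before they are stored. This does not make the resolvers string safe: it is executed with eval, so it must come from a trusted source or be replaced with a declarative format.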
  3. Rate Limiting:

    • Implement rate limiting on the /register-schema endpoint to prevent abuse by malicious tenants.
  4. Data Persistence:

    • Consider using a database to store schemas persistently instead of in-memory storage, so schemas are not lost if the server restarts.
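
Rate limiting on /register-schema can be approximated with a small fixed-window counter per tenant. The sketch below is one simple option under stated assumptions, not a production limiter: the RateLimiter class is hypothetical, state lives in process memory, and the clock is injectable so the behavior can be checked without waiting.

```javascript
// Hypothetical fixed-window limiter: at most `limit` calls per tenant per window.
class RateLimiter {
  constructor(limit, windowMs, now = Date.now) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;           // injectable clock, handy for tests
    this.windows = new Map(); // tenantId -> { start, count }
  }

  // Returns true and counts the call if allowed; false if over the limit.
  allow(tenantId) {
    const t = this.now();
    let current = this.windows.get(tenantId);
    if (!current || t - current.start >= this.windowMs) {
      current = { start: t, count: 0 }; // start a fresh window
      this.windows.set(tenantId, current);
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

let fakeNow = 0;
const limiter = new RateLimiter(2, 60000, () => fakeNow);
console.log(limiter.allow('tenant1')); // true
console.log(limiter.allow('tenant1')); // true
console.log(limiter.allow('tenant1')); // false
fakeNow = 60000;
console.log(limiter.allow('tenant1')); // true (new window)
```

In the Express handler, a false result from allow(req.tenantId) would translate into a 429 response before any schema is built.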

Final Thoughts

This implementation provides a basic structure for a secure, multi-tenant GraphQL server with private tenant-specific schemas. You can expand it by adding persistent storage, enhanced security, and additional features like schema versioning, monitoring, and logging to build a robust multi-tenant GraphQL solution.
