Instructors: John Davis, Ahmed Awan
This tutorial walks you through developing an extension to Galaxy, and how to contribute back to the core project.
- Learn about the main principles of Galaxy's code architecture in the context of adding a new feature
- Gain experience working on some of the main parts of the galaxy codebase:
- Learn how to modify Galaxy's data model and database schema
- Learn how to expose a new feature via the API
- Learn how to implement a new feature in the UI
- Learn about the main libraries used by Galaxy:
- Learn how to use Galaxy's testing frameworks while practicing test-driven development:
- Write, execute locally, interpret and debug failed unit tests, api tests, client tests
- Learn about the process of contributing to Galaxy
- Slides: Galaxy Code Architecture
- GitHub account, Git, Python, your favorite text editor
- Setting up Galaxy locally
- New feature to implement
- Updating the data model
- Database schema migrations
- Test Driven Development
- Implementing the API
- Building the UI
To contribute to galaxy, a GitHub account is required. Changes are proposed via a pull request. This allows the project maintainers to review the changes and suggest improvements.
The general steps are as follows:
- Fork the Galaxy repository
- Clone your fork
- Make changes in a new branch
- Commit your changes, push branch to your fork
- Open a pull request for this branch in the upstream Galaxy repository
- Use GitHub UI to fork Galaxy’s repository at galaxyproject/galaxy.
- Clone your forked repository to a local path, further referred to as GALAXY_ROOT and
cd
into GALAXY_ROOT. Note that we specify the tutorial branch with the-b
option:
$ git clone https://github.com/<your-username>/galaxy GALAXY_ROOT
$ cd GALAXY_ROOT
Before we can use Galaxy, we need to create a virtual environment and then activate it:
$ python3 -m venv .venv
$ source .venv/bin/activate
Once activated, you’ll see the name of the virtual environment prepended to your shell prompt: (.venv)$.
Next, let’s create a new git branch for your edits:
$ git checkout -b my-feature
Now when you run git branch you’ll see that your new branch is activated:
$ git branch
dev
* my-feature
Note: my-feature is just an example; you can call your new branch anything you like.
Finally, as one last step, we need to install the required dependencies and initialize the database. This only applies if you are working on a clean clone and have not started Galaxy. Initializing the database is necessary because you will be making changes to the database schema, which cannot be applied to a database that has not been initialized.
To install the required dependencies and initialize the database, you can simply start Galaxy by executing the following script (might take some time when executing for the first time):
$ sh run.sh
This will start Galaxy at http://127.0.0.1:8080
To setup the proposed extension imagine you’re running a specialized Galaxy server and each of your users only use a few of Galaxy datatypes. You’d like to tailor the UI experience by allowing users of Galaxy to select their favorite datatypes for additional filtering in downstream applications, UI extensions, etc.
Like many extensions to Galaxy, the proposed change requires persistent state. Galaxy stores most persistent state in a relational database. The Python layer that defines Galaxy’s data model is setup by defining SQLAlchemy models.
The proposed extension could be implemented in several different ways on Galaxy’s backend. We will choose one for this example for its simplicity, so we can focus on demonstrating how to modify and extend various layers of Galaxy.
With simplicity in mind, we will implement our proposed extension to Galaxy by adding a single new table to Galaxy’s data model called user_favorite_datatype
. The concept of a favorite datatype will be represented by a one-to-many relationship from the table that stores Galaxy’s user records to this new table. The datatype itself that will be marked as a favorite will be represented as a filename extension and stored as a text field in this new table. This table will also need to include an integer primary key named id
to follow the example set by the rest of the Galaxy data model.
Galaxy uses a relational database to persist objects and object relationships. Galaxy’s data model represents the object view of this data. To map objects and their relationships onto tables and rows in the database, Galaxy relies on SQLAlchemy, a SQL toolkit and object-relational mapper.
The mapping between objects and the database is defined in lib/galaxy/model/__init__.py
via declarative mapping, which means that models are defined as Python classes together with database metadata that specifies the database table corresponding to each class.
To implement the required changes to add the new model, you need to create a new class in lib/galaxy/model/__init__.py
- Add a class definition for your new class. You class should be a subclass of
Base
andRepresentById
. - Add a
__tablename__
attribute - Add a
mapped_column
attribute for the primary key (should be namedid
) - Add a
mapped_column
attribute to store the datatype - Add a
mapped_column
attribute that will serve as a reference (foreign key) to the User model - Use the relationship function to define an association between your new model and the
User
model.
You can add the new class definition right before the definition of the User
class. Here's what your class definition should look like:
class UserFavoriteDatatype(Base, RepresentById):
__tablename__ = "user_favorite_datatype"
id: Mapped[int] = mapped_column(primary_key=True)
user_id: Mapped[int] = mapped_column(ForeignKey("galaxy_user.id"))
datatype: Mapped[str] = mapped_column(String(255))
user: Mapped["User"] = relationship(back_populates="favorite_datatypes")
The defined relationship should be set on both tables, so you need to add a relationship to the User
class:
favorite_datatypes: Mapped[List["UserFavoriteDatatype"]] = relationship(back_populates="user")
When a pull request is submitted to the Galaxy repository on github, multiple github actions are triggered that run thousands of tests to verify the correctness of the proposed changes. It is good practice to at least run a liniting check against your code locally before opening a pull request.
First, we must install ruff:
$ pip install -r lib/galaxy/dependencies/pinned-lint-requirements.txt
Next, we can check our code:
$ ruff check .
All checks passed!
It is good practice to split up your contribution into logical parts: that makes it easier to review the changes, as well as debug the code when necessary, now or in the future. Now is a good time to commit your changes.
First, we review what we're about to commit:
$ git status
$ git diff
Now let's stage your changes:
$ git add --all
And, finally, commit, giving your commit an appropriate description:
$ git commit -m "Add UserFavoriteDatatype model"
And we're done with this step!
Here's what your commit will look like for this step: 3d10657b45c798826abeea896ce82e402f40a688
There is one last database issue to consider before moving on to the API. Each successive release of Galaxy requires recipes for how to migrate old database schemas to updated ones. These recipes are called revisions, or migrations, and are implemented using Alembic.
Database schema changes for Galaxy's data model are defined here: lib/galaxy/model/migrations/alembic/versions_gxy
We encourage you to read Galaxy’s documentation on migrations, as well as relevant Alembic documentation.
For this tutorial, you’ll need to do the following:
- Create a revision template.
- Edit the revision template, filling in the body of the upgrade and downgrade functions.
- Run the migration.
First, let's create a blank migration:
$ sh scripts/db_dev.sh revision -m "Add user_favorite_datatype table"
This command will generate a blank migration template and save it in the versions_gxy
directory (as noted above). The code in the template will look like this (your hashes will be different):
"""Add user_favorite_datatype table
Revision ID: 6b8c34a345f5
Revises: c716ee82337b
Create Date: 2025-06-22 17:53:20.645121
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = '6b8c34a345f5'
down_revision = 'c716ee82337b'
branch_labels = None
depends_on = None
def upgrade():
pass
def downgrade():
pass
To fill in the revision template, you need to populate the body of the upgrade and downgrade functions. The upgrade function is executed during a schema upgrade, so it should create your table. The downgrade function is executed during a schema downgrade, so it should drop your table. Note that although the table creation command looks similar to the one we used to define the model, it is not the same.
Your final migration code will look like this (note the required import statement):
import sqlalchemy as sa
from galaxy.model.migrations.util import (
create_table,
drop_table,
)
# revision identifiers, used by Alembic.
revision = "11a0b86fe248"
down_revision = "c716ee82337b"
branch_labels = None
depends_on = None
TABLE_NAME = "user_favorite_datatype"
def upgrade():
create_table(
TABLE_NAME,
sa.Column("id", sa.Integer, primary_key=True),
sa.Column("user_id", sa.Integer, sa.ForeignKey("galaxy_user.id")),
sa.Column("datatype", sa.String(255)),
)
def downgrade():
drop_table(TABLE_NAME)
Now we are ready to run the migration:
$ sh manage_db.sh upgrade
Activating virtualenv at .venv
INFO:alembic.runtime.migration:Context impl SQLiteImpl.
INFO:alembic.runtime.migration:Will assume non-transactional DDL.
INFO:alembic.runtime.migration:Running upgrade c716ee82337b -> 6b8c34a345f5, Add user_favorite_datatype table
DEBUG:alembic.runtime.migration:update c716ee82337b to 6b8c34a345f5
INFO:alembic.runtime.migration:Context impl SQLiteImpl.
INFO:alembic.runtime.migration:Will assume non-transactional DDL.
Now let's examine the database. You database is specified in the database_connection
property in the main Galaxy config file typically located in config/galaxy.yml
. We don't have a config file, so Galaxy is using the default: config/galaxy.yml.sample
. If you check that file, you'll see that the default database is located at database/universe.sqlte
. Let's examine the database: use the sqlite3
command to launch the database CLI, then use the .schema
command to view the definition of the new table that has been added:
$ sqlite3 database/universe.sqlite
SQLite version 3.34.1 2021-01-20 14:10:07
Enter ".help" for usage hints.
sqlite> .schema user_favorite_datatype
CREATE TABLE user_favorite_datatype (
id INTEGER NOT NULL,
user_id INTEGER,
datatype VARCHAR(255),
CONSTRAINT user_favorite_datatype_pkey PRIMARY KEY (id),
CONSTRAINT user_favorite_datatype_user_id_fkey FOREIGN KEY(user_id) REFERENCES galaxy_user (id)
);
You may also execute the manage_db.sh dv
command which will display the database version: one of the hashes will be the hash of the new migration:
$ sh manage_db.sh dv
Activating virtualenv at .venv
INFO:alembic.runtime.migration:Context impl SQLiteImpl.
INFO:alembic.runtime.migration:Will assume non-transactional DDL.
d4a650f47a3c (head)
6b8c34a345f5 (head)
Check your code and fix any errors if needed:
$ ruff check .
Commit the migration:
$ git add --all
$ git commit -m "Add migration"
Here's what your commit will look like for this step: 7ef8f88395c0afc00181b14c92d35ae06196463d
With the data model and database migration in place, we need to start adding the rest of the Python plumbing required to implement this feature. We will do this with a test-driven approach and start by implementing an API test that exercises operations we would like to have available for favorite extensions.
Note: API tests are only one example of tests we may want to write for this feature. We recommend studying Galaxy's documentation on how to write tests, as well as this GTN tutorial.
We will add a test case for user favorite datatypes in lib/galaxy_test/api/test_users.py
which is a module that contains tests for other user API endpoints. Various user-centered operations have endpoints under api/user/<user_id>
and api/user/current
is sometimes substituable as the current user. We will keep things simple and only implement this functionality for the current user.
Our test will test the following three simple API endpoints:
method | endpoint | description |
---|---|---|
GET | <galaxy_root_url>/api/users/current/favorite_datatypes |
This should return a list of favorited datatypes for the current user |
POST | <galaxy_root_url>/api/users/current/favorite_datatypes/<datatype> |
This should mark a datatype as a favorite for the current user |
DELETE | <galaxy_root_url>/api/users/current/favorite_datatypes/<datatype> |
This should unmark a datatype as a favorite for the current user |
Our test will do the following:
- Verify a user initially has no favorite datatypes.
- Verify that a POST to
<galaxy_root_url>/api/users/current/favorite_datatypes/fasta
returns a 200 status code indicating success. - Verify that after this POST the list of user favorited datatypes contains fasta and is of size 1.
- Verify that a DELETE to
<galaxy_root_url>/api/users/current/favorite_datatypes/fasta
succeeds. - Verify that after this DELETE the favorite datatypes list is again empty.
Here's one way to implement such a test (your version might be different):
def test_favorite_datatypes(self):
# verify no datatypes
index_response = self._get("users/current/favorite_datatypes")
index_response.raise_for_status()
index = index_response.json()
assert isinstance(index, list)
assert len(index) == 0
# add datatypesAdd commentMore actions
create_response = self._post("users/current/favorite_datatypes/fasta")
create_response.raise_for_status()
create_response = self._post("users/current/favorite_datatypes/fastq")
create_response.raise_for_status()
# verify added
index_response = self._get("users/current/favorite_datatypes")
index_response.raise_for_status()
index = index_response.json()
assert isinstance(index, list)
assert len(index) == 2
assert "fasta" in index
assert "fastq" in index
# delete datatype
delete_response = self._delete("users/current/favorite_datatypes/fasta")
delete_response.raise_for_status()
# verify deleted
index_response = self._get("users/current/favorite_datatypes")
index_response.raise_for_status()
index = index_response.json()
assert isinstance(index, list)
assert len(index) == 1
assert "fastq" in index
To run the test, we need to install pytest:
$ pip install pytest
Then we can run the new test we have just added:
pytest lib/galaxy_test/api/test_users.py::TestUsersApi::test_favorite_datatypes
(Note: if you see an error referring to psycopg*
modules, run pip uninstall pytest-postgresql
)
Of course, the test will fail, but that's expected: we haven't implemented the API yet.
Before we move on to the next step, let's check our code and commit.
Check your code and fix any errors if needed:
$ ruff check .
Commit the new test:
$ git add --all
$ git commit -m "Add API test"
Here's what your commit will look like for this step: 1c0a4aaadc407ee5030c1bef80124a262edc480b
We'll add our API implementation as a new module at lib/galaxy/webapps/galaxy/api/user_favorite_datatypes.py
.
We have three endpoints to implement. Galaxy relies on FastAPI to handle the low-level complexities of sending and receiving data via HTTP. We encourage you to study its excellent documentation, as well as the many examples of its usage in the context of Galaxy located at lib/galaxy/webapps/galaxy/api/
.
The overall approach is to define a class with three methods, one for each endpoint. The methods describe the endpoints and delegate the execution of the actual tasks to a user manager.
Here's the code you need for this step:
from typing import List
from fastapi import Path
from galaxy.managers.context import ProvidesUserContext
from galaxy.managers.users import UserManager
from . import (
depends,
DependsOnTrans,
Router,
)
router = Router(tags=["user_favorites"])
DatatypePath: str = Path(
..., # Mark this Path parameter as required
title="Datatype",
description="Target file extension for target operation.",
)
@router.cbv
class FastAPIUserFavoriteDatatypes:
user_manager: UserManager = depends(UserManager)
@router.get(
"/api/users/current/favorite_datatypes",
summary="List user favorite datatypes",
response_description="List of datatypes",
)
def index(
self,
trans: ProvidesUserContext = DependsOnTrans,
) -> List[str]:
"""Gets the list of user's favorite datatypes."""
return self.user_manager.get_favorite_datatypes(trans.user)
@router.post(
"/api/users/current/favorite_datatypes/{datatype}",
summary="Mark a datatype as the current user's favorite.",
)
def create(
self,
datatype: str = DatatypePath,
trans: ProvidesUserContext = DependsOnTrans,
) -> str:
self.user_manager.add_favorite_datatype(trans.user, datatype)
return datatype
@router.delete(
"/api/users/current/favorite_datatypes/{datatype}",
summary="Unmark a datatype as the current user's favorite.",
)
def delete(
self,
datatype: str = DatatypePath,
trans: ProvidesUserContext = DependsOnTrans,
) -> None:
self.user_manager.delete_favorite_datatype(trans.user, datatype)
Before we move on to the next step, let's check our code and commit.
Check your code and fix any errors if needed:
$ ruff check .
Commit the new test:
$ git add --all
$ git commit -m "Implement API"
Here's what your commit will look like for this step: 199f796b346a324c047984e86c0fa921885a0fa9
The code that handles the tasks described by the endpoints should be located in the user manager at
lib/galaxy/managers/users.py
.
The user manager is where the actual task is implemented. Each method interacts with the database via Galaxy's data access layer, implemented with the help of SQLAlchemy. The methods are surprisingly simple: retrieving a user's favorite datatypes is a one-liner: the relationship we added to the User model at the beginning of this tutorial handles all the complexity for us. The add and delete methods are almost as simple. To add an item we instantiate a new model, add it to the session (which is responsible for keeping the state of the data objects in Galaxy in sync with the rows in the database that reflect their state). To delete an item we create a delete statement that knows how to generate the required SQL, and let the session execute it. In all three cases, the complexity of transactional communication with the database is handled by SQLAlchemy.
Here's the code you need for this step. Note the additional import statement. The method definitions should be added to the UserManager
class; for example, you could add them
right after the __init__
method, but that's not a requirement.
from sqlalchemy import delete # You need this import!
# The following code should be added to the UserManager class:
def get_favorite_datatypes(self, user):
return [fd.datatype for fd in user.favorite_datatypes]
def add_favorite_datatype(self, user, datatype):
fd = model.UserFavoriteDatatype(datatype=datatype)
user.favorite_datatypes.append(fd)
self.session().add(user)
self.session().commit()
def delete_favorite_datatype(self, user, datatype):
stmt = delete(model.UserFavoriteDatatype).where(model.UserFavoriteDatatype.datatype == datatype)
self.session().execute(stmt)
self.session().commit()
Now we can finally rerun our test!
pytest lib/galaxy_test/api/test_users.py::TestUsersApi::test_favorite_datatypes
Your test should pass. If not: scroll up the beginning of the pytest output and read the error message carefully - it should point you to the source of the error.
Check your code and fix any errors if needed:
$ ruff check .
Commit the new test:
$ git add --all
$ git commit -m "Add manager methods"
Here's what your commit will look like for this step: a3dcd89d7ea195fb8edcbd524898dc61b46ac16d
Once the API test is done, it is time to build a user interface for this addition to Galaxy. Let’s get some of the plumbing out of the way right away. We’d like to have a URL for viewing the current user’s favorite datatypes in the UI. Let's use this one: /user_favorite_datatypes
(i.e., in the context of locally running galaxy, it might be http://127.0.0.1:8080/user_favorite_datatypes
.
The Galaxy front-end, or client, is made up of many components created in Vue; most of them use Vue's Composition API and are implemented in Typescript. To learn more we encourage you to study Vue's documentation. Of course, there are numerous online resources on JavaScript and TypeScript.
To enable our new route, at the minimum we need to do the following:
- Create a minimal Vue component (we'll worry about implementing the actual functionality in the next step)
- Register the new route on the backend
- Register the new route on the client.
We create the component at client/src/components/User/UserFavoriteDatatypes.vue
. We start with an empty script block and a minimal template:
<script setup lang="ts">
</script>
<template>
<div>
<h1>Favorite User Datatypes</h1>
<p>to do</p>
</div>
</template>
We register our route on the backend by adding the following line to lib/galaxy/webapps/galaxy/buildapp.py
:
webapp.add_client_route("/user_favorite_datatypes")
Add this line next to similar lines that register other client routes.
To register our route on the client, we modify client/src/entry/analysis/router.js
:
Import the new component:
import UserFavoriteDatatypes from "@/components/User/UserFavoriteDatatypes.vue";
Add the route to Analysis
routes in the getRouter
function:
{
path: "user_favorite_datatypes",
component: UserFavoriteDatatypes,
}
When we work on the client, it helpes to have a development server running that will rebuild and reload the code it serves every time you save. For this to work, we need to run the client development server and run Galaxy at the same time:
The following command starts the client development server at port 8081:
$ make client-dev-server
Now run Galaxy (on port 8080), but we can skip the client build this time, because we will be running the dev server proxied against 8080:
$ GALAXY_SKIP_CLIENT_BUILD=1 sh run.sh
OR, we can run the equivalent Make command:
$ make skip-client
NOTE: You must login before proceeding. Since this is a new Galaxy, you'll need to register (follow the normal registration steps, except you don't need a secure password since it's your local development copy.
Now you can view your empty page at http://127.0.0.1:8081/user_favorite_datatypes.
With the dev server running, whenever you make a change to the client code (this does not affect changes to the backend code), you will be able to see those changes live (it takes a few seconds to rebuild) at http://127.0.0.1:8081.
Note: the port is 8081; your regular Galaxy is running at port 8080.
$ git add --all
$ git commit -m "Add blank page and route"
Here's what your commit will look like for this step: edbf7790465d6296142ddfe80be328b78d676a79
Now we must implement the UI functionality that enables a user to select (and deselect) datatypes. For that, we will utilize four API endpoints: the three that we have added, and an existing one that returns all Galaxy datatypes: api/datatypes
. Assuming your Galaxy is running, you can view the data you'll need here:
- http://127.0.0.1:8080/api/datatypes: all datatypes
- http://127.0.0.1:8080/api/users/current/favorite_datatypes: favorite datatypes (empty so far)
Here's one way to design a simple UI for the required functionality:
- display a list of all datatypes
- if a datatype has not been marked as favorite, display a "+" link
- if a datatype has been marked as a favorit, display a "x" link
- clicking the "+" link marks the datatype as a favorite
- clicking the "x" link unmarks the datatype as a favorite
The following code implements this UI:
<script setup lang="ts">
import axios from "axios";
import { onMounted, ref } from "vue";
const datatypes = ref<any[]>([]);
const favoriteDatatypes = ref<any[]>([]);
onMounted(async () => {
await loadDatatypes();
await loadFavoriteDatatypes();
});
async function loadDatatypes() {
const path = 'api/datatypes';
try {
const response = await axios.get(path);
datatypes.value = response.data;
} catch (error) {
console.error(error);
}
}
async function loadFavoriteDatatypes() {
const path = 'api/users/current/favorite_datatypes';
try {
const response = await axios.get(path);
favoriteDatatypes.value = response.data.sort();
} catch (error) {
console.error(error);
}
}
async function mark(datatype) {
const path = `api/users/current/favorite_datatypes/${datatype}`;
try {
await axios.post(path);
await loadFavoriteDatatypes();
} catch (error) {
console.error(error);
}
}
async function unmark(datatype) {
const path = `api/users/current/favorite_datatypes/${datatype}`;
try {
await axios.delete(path);
await loadFavoriteDatatypes();
} catch (error) {
console.error(error);
}
}
</script>
<template>
<div>
<h1>Favorite Datatypes</h1>
<ul>
<li v-for="dt in datatypes" :key="dt">
{{ dt }}
<a @click="unmark(dt)" href="#" v-if="favoriteDatatypes.indexOf(dt) >= 0" class="red">X</a>
<a @click="mark(dt)" href="#" v-else class="green">+</a>
</li>
</ul>
</div>
</template>
<style>
.red {
color: red;
}
.green {
color: green;
}
</style>
If you'd like to see the same changes show up on the original proxy (on port 8080), run the run.sh
command to ensure that the client code is rebuilt (since we won't be skipping the client build this time):
$ sh run.sh
Now you can view your new page at the default port 8080: http://127.0.0.1:8080/user_favorite_datatypes.
Try it out! Try to mark and unmark datatypes.
(Note: if nothing happens, and your browser console output shows an error, check whether you are logged in!)
$ git add --all
$ git commit -m "Display all datatypes"
Here's what your commit will look like for this step: 43f2b74deb1980b7f7933e5c1ccd3211f511dbf0