Getting Started
g4hunterpy3 is a Python 3 implementation of the G4Hunter algorithm for predicting G-quadruplex (G4) propensity in DNA sequences. It provides both a Python API and a command-line interface (CLI).
What is G4Hunter?
G4Hunter is a bioinformatic tool that predicts the propensity of a sequence to form G-quadruplexes (G4s), four-stranded nucleic acid structures stabilized by guanine quartets. Unlike pattern-matching algorithms that rely on rigid consensus sequences, G4Hunter assigns a score based on G-richness and G-skewness, that is, the G/C asymmetry between the two strands (see Bedrat et al. 2016).
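To make the scoring idea concrete, here is a minimal sketch of the per-base scheme described in Bedrat et al. 2016, written independently of g4hunterpy3. It assumes the published rules: each G in a run of n consecutive Gs scores min(n, 4), each C scores the negative equivalent, A and T score 0, and the score of a sequence or window is the mean of its per-base scores. The function names (base_scores, mean_score, window_means) are illustrative and are not part of the package's API.

```python
from itertools import groupby
from typing import List


def base_scores(seq: str) -> List[int]:
    """Per-base scores: G runs are positive, C runs negative, capped at 4."""
    scores: List[int] = []
    for base, run in groupby(seq.upper()):
        n = len(list(run))          # length of this run of identical bases
        if base == "G":
            value = min(n, 4)
        elif base == "C":
            value = -min(n, 4)
        else:
            value = 0               # A, T and any other symbol are neutral
        scores.extend([value] * n)
    return scores


def mean_score(seq: str) -> float:
    """Mean of the per-base scores over the whole sequence."""
    scores = base_scores(seq)
    return sum(scores) / len(scores) if scores else 0.0


def window_means(seq: str, window_size: int) -> List[float]:
    """Mean score of every full-length sliding window.

    Base scores are computed on the whole sequence first, so a G run
    cut by a window edge keeps the score of its full run.
    """
    scores = base_scores(seq)
    return [
        sum(scores[i:i + window_size]) / window_size
        for i in range(len(scores) - window_size + 1)
    ]


# Four GGG runs in 21 bases: 36 / 21, prints 1.71
print(round(mean_score("GGGTTAGGGTTAGGGTTAGGG"), 2))
```

Positive scores mark G-rich stretches on the given strand. Negative scores mark C-rich stretches, which point to G4 propensity on the complementary strand.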
Setting up a Conda Environment
We recommend using a dedicated conda environment. You can create one with:
conda create -n g4hunter python=3.11 -y
conda activate g4hunter
Or if you use mamba:
mamba create -n g4hunter python=3.11 -y
mamba activate g4hunter
Installation
Install directly from GitHub (no cloning required)
The simplest way to install g4hunterpy3 is directly from the GitHub repository:
pip install git+https://github.com/holehouse-lab/g4hunterpy3.git
This installs the latest version from the main branch. To install a specific branch or tag, append @<branch-or-tag> to the repository URL:
# install from a specific branch
pip install git+https://github.com/holehouse-lab/g4hunterpy3.git@dev
# install from a specific release tag
pip install git+https://github.com/holehouse-lab/g4hunterpy3.git@v1.0.0
Install from a local clone
If you want to work on the source or run the test suite:
git clone https://github.com/holehouse-lab/g4hunterpy3.git
cd g4hunterpy3
pip install .
# or in editable / development mode:
pip install -e .
Dependencies
g4hunterpy3 depends on:
Python >= 3.8
numpy
matplotlib
protfasta
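pip installs the package dependencies (numpy, matplotlib, protfasta) automatically. Python itself must already be available, for example from the conda environment described above.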
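If you want to confirm that the requirements listed above are met in the active environment, a short check with the standard library can help. This is only a sketch: the helper names (installed_version, check_environment) are hypothetical and not part of g4hunterpy3, and it reports what is installed without validating version compatibility.

```python
import sys
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(packages=("numpy", "matplotlib", "protfasta")):
    """Report whether Python is >= 3.8 and which listed packages are found."""
    report = {"python_ok": sys.version_info >= (3, 8)}
    for name in packages:
        report[name] = installed_version(name)
    return report


# Print one line per requirement; a value of None means "not installed".
for key, value in check_environment().items():
    print(f"{key}: {value}")
```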
Verify the installation
After installation, verify everything is working:
import g4hunterpy3
print(g4hunterpy3.__version__)
You can also check that the CLI is available:
g4hunterpy3 --help
Quick example
Scan a sequence in Python:
from g4hunterpy3.core import scan_sequence
seq = "ATGGGGATTTTGGGGCCCGGGGATTTGGGG"
window_scores, hits, regions = scan_sequence(seq, window_size=10, threshold=1.0)
print(f"Found {len(hits)} window hits and {len(regions)} merged regions")
for r in regions:
    print(f" Region {r.start}-{r.end}: score={r.score}, seq={r.sequence}")
Or from the command line:
g4hunterpy3 -i input.fasta -o results/ -w 25 -s 1.2 --info
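The command above expects a FASTA file as input. If you do not have one at hand, the following sketch writes a small test file using only the standard library. The write_fasta helper and the record name "example" are hypothetical and chosen for illustration; the sequence is the one used in the Python example, and the file name matches the input.fasta used on the command line.

```python
from pathlib import Path


def write_fasta(records, path, line_width=60):
    """Write a {name: sequence} mapping to a FASTA file and return its path."""
    path = Path(path)
    with path.open("w") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n")
            # Wrap long sequences to a fixed line width
            for i in range(0, len(seq), line_width):
                handle.write(seq[i:i + line_width] + "\n")
    return path


records = {"example": "ATGGGGATTTTGGGGCCCGGGGATTTGGGG"}
fasta_path = write_fasta(records, "input.fasta")
print(f"Wrote {fasta_path}")
```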