Getting Started

g4hunterpy3 is a Python 3 implementation of the G4Hunter algorithm for predicting G-quadruplex (G4) propensity in DNA sequences. It provides both a Python API and a command-line interface (CLI).

What is G4Hunter?

G4Hunter is a bioinformatics tool that predicts the propensity of a sequence to form G-quadruplexes (G4s), four-stranded nucleic acid structures stabilized by guanine quartets. Unlike pattern-matching algorithms that rely on rigid consensus sequences, G4Hunter scores a sequence by its G-richness and G-skewness (see Bedrat et al. 2016).
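To make the scoring idea concrete, here is a minimal standalone sketch of the per-base scheme as described in Bedrat et al. 2016: each G in a run of n consecutive Gs scores +min(n, 4), each C in a run of Cs scores the corresponding negative value, and all other bases score 0. The function names below are hypothetical and this is not g4hunterpy3's own code; it only illustrates how G-richness and G-skewness combine into a single number.

```python
from itertools import groupby
from typing import List


def g4hunter_base_scores(seq: str, cap: int = 4) -> List[int]:
    """Per-base scores: G runs are positive, C runs negative, other bases 0."""
    scores: List[int] = []
    for base, run in groupby(seq.upper()):
        n = len(list(run))  # length of this run of identical bases
        if base == "G":
            value = min(n, cap)
        elif base == "C":
            value = -min(n, cap)
        else:
            value = 0
        scores.extend([value] * n)
    return scores


def g4hunter_mean_score(seq: str) -> float:
    """Mean of the per-base scores over the whole sequence."""
    scores = g4hunter_base_scores(seq)
    return sum(scores) / len(scores) if scores else 0.0


print(g4hunter_base_scores("ATGGGGATCC"))  # [0, 0, 4, 4, 4, 4, 0, 0, -2, -2]
print(g4hunter_mean_score("GGGTTAGGG"))    # 2.0
```

A strongly positive mean indicates a G-rich stretch on the given strand, while a strongly negative mean indicates a C-rich stretch, that is, G-richness on the complementary strand.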

Setting up a Conda Environment

We recommend using a dedicated conda environment. You can create one with:

conda create -n g4hunter python=3.11 -y
conda activate g4hunter

Or, if you use mamba:

mamba create -n g4hunter python=3.11 -y
mamba activate g4hunter

Installation

Install directly from GitHub (no cloning required)

The simplest way to install g4hunterpy3 is directly from the GitHub repository:

pip install git+https://github.com/holehouse-lab/g4hunterpy3.git

This installs the latest version from the main branch. To install a specific branch or tag, append @<branch-or-tag>:

# install from a specific branch
pip install git+https://github.com/holehouse-lab/g4hunterpy3.git@dev

# install from a specific release tag
pip install git+https://github.com/holehouse-lab/g4hunterpy3.git@v1.0.0

Install from a local clone

If you want to work on the source or run the test suite:

git clone https://github.com/holehouse-lab/g4hunterpy3.git
cd g4hunterpy3
pip install .

# or in editable / development mode:
pip install -e .

Dependencies

g4hunterpy3 depends on:

  • Python >= 3.8

  • numpy

  • matplotlib

  • protfasta

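pip installs the Python packages listed above (numpy, matplotlib, protfasta) automatically. The Python interpreter itself is not installed by pip and must already be available, for example from a dedicated conda environment.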

Verify the installation

After installation, verify from a Python session that the package imports cleanly and reports its version:

import g4hunterpy3
print(g4hunterpy3.__version__)

You can also check that the CLI is available:

g4hunterpy3 --help
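If either check fails, it can help to see which pieces are missing. The sketch below uses only the standard library (importlib.metadata and shutil.which) to report the installed version of each distribution and whether the CLI is on your PATH. It assumes the distribution names and the CLI entry point match the names used on this page; the helper functions are hypothetical and not part of g4hunterpy3.

```python
import shutil
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def report(dist_names, cli_name):
    """Build one status line per distribution, plus one for the CLI."""
    lines = []
    for name in dist_names:
        found = installed_version(name)
        lines.append(f"{name}: {found if found else 'NOT INSTALLED'}")
    cli_path = shutil.which(cli_name)  # None if the command is not on PATH
    lines.append(f"{cli_name} CLI: {cli_path if cli_path else 'not on PATH'}")
    return lines


# Names below are assumed to match those listed under Dependencies.
for line in report(["g4hunterpy3", "numpy", "matplotlib", "protfasta"], "g4hunterpy3"):
    print(line)
```

A missing CLI with an installed package usually means the environment's scripts directory is not on PATH, which most often happens when the conda environment is not activated.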

Quick example

Scan a sequence in Python:

from g4hunterpy3.core import scan_sequence

seq = "ATGGGGATTTTGGGGCCCGGGGATTTGGGG"
window_scores, hits, regions = scan_sequence(seq, window_size=10, threshold=1.0)

print(f"Found {len(hits)} window hits and {len(regions)} merged regions")
for r in regions:
    print(f"  Region {r.start}-{r.end}: score={r.score}, seq={r.sequence}")

Or from the command line:

g4hunterpy3 -i input.fasta -o results/ -w 25 -s 1.2 --info
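To illustrate what "window hits" and "merged regions" mean in the Python example above, here is a standalone sketch of a sliding-window scan over a list of per-base scores. It assumes that a window counts as a hit when the absolute value of its mean score reaches the threshold, and that overlapping or adjacent hit windows are merged into one region. Both are assumptions made for illustration: the function names are hypothetical, the input scores are made up, and g4hunterpy3's actual scan_sequence may differ in its details (for example, in how it treats strand sign or reports coordinates).

```python
def window_means(scores, window_size):
    """Mean score of every full-length window, computed with a running sum."""
    if window_size <= 0 or window_size > len(scores):
        return []
    total = sum(scores[:window_size])
    means = [total / window_size]
    for i in range(window_size, len(scores)):
        total += scores[i] - scores[i - window_size]  # slide one position right
        means.append(total / window_size)
    return means


def merge_hits(means, window_size, threshold):
    """Merge hit windows into (start, end) spans, 0-based and end-exclusive."""
    regions = []
    for start, mean in enumerate(means):
        if abs(mean) < threshold:
            continue  # this window is not a hit
        end = start + window_size
        if regions and start <= regions[-1][1]:
            # Overlaps or touches the previous region: extend it.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


# Hypothetical per-base scores: one G-rich stretch and one C-rich stretch.
scores = [0, 0, 3, 3, 3, 0, 0, 0, 0, 0, 0, 0, -3, -3, -3, 0]
means = window_means(scores, window_size=4)
print(merge_hits(means, window_size=4, threshold=1.5))  # [(0, 7), (10, 16)]
```

Larger windows smooth the score and favor longer G-rich stretches, while a higher threshold trades sensitivity for fewer, more confident regions.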