Quick Start
This guide will help you run your first molecular machine learning workflow with DeepChem Server.
Before You Begin
Ensure you have:
DeepChem Server running (see Installation)
A sample dataset (CSV file with SMILES strings or molecular data)
A web browser for accessing the interactive API documentation
Interactive API Documentation
The best way to explore and test the API is through the interactive documentation:
Swagger UI: http://localhost:8000/docs
This provides:
Complete endpoint documentation with request/response schemas
Interactive request testing with real-time responses
Parameter descriptions and validation rules
Example requests and responses
Schema definitions for all data models
Basic Workflow
The typical molecular machine learning workflow involves:
Upload Data: Submit your molecular dataset to the server
Featurize: Transform molecules into machine learning features
TVTSplit: Split the dataset into training, validation, and test sets
Train: Build machine learning models on featurized data
Evaluate: Assess model performance
Infer: Make predictions on new data
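As a rough sketch, the steps above can be paired with the endpoints listed under Available Endpoints. The pairing is inferred from matching names and is an assumption, as is the `describe_workflow` helper, which is hypothetical and not part of DeepChem Server or pyds. Request bodies are not shown here; the interactive documentation has the actual schemas.

```python
# Workflow steps paired with endpoints from "Available Endpoints".
# The pairing is an assumption based on matching names.
WORKFLOW = [
    ("Upload Data", "POST", "/data/uploaddata"),
    ("Featurize", "POST", "/primitive/featurize"),
    ("TVTSplit", "POST", "/primitive/train-valid-test-split"),
    ("Train", "POST", "/primitive/train"),
    ("Evaluate", "POST", "/primitive/evaluate"),
    ("Infer", "POST", "/primitive/infer"),
]


def describe_workflow(base_url="http://localhost:8000"):
    """Return one human-readable line per workflow step, in order."""
    base = base_url.rstrip("/")
    return [
        f"{i}. {step}: {method} {base}{path}"
        for i, (step, method, path) in enumerate(WORKFLOW, start=1)
    ]


if __name__ == "__main__":
    print("\n".join(describe_workflow()))
```

Keeping the steps in an ordered list makes the dependency explicit: each step consumes the output of the one before it.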
Available Endpoints
Data Management
POST /data/uploaddata: Upload datasets to the datastore
GET /data/{dataset_id}/download: Download processed datasets
Primitive Operations
POST /primitive/featurize: Apply molecular featurization
POST /primitive/train: Train machine learning models
POST /primitive/evaluate: Evaluate model performance
POST /primitive/infer: Run inference on new data
POST /primitive/train-valid-test-split: Split datasets for training
System
GET /healthcheck: Check server health status
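The download endpoint takes a path parameter, so the URL has to be assembled per dataset. The sketch below shows one way to fill in `{dataset_id}`. The `download_url` helper is hypothetical, the base URL is the default used in this guide, and percent-encoding the ID is a defensive assumption since the format of dataset IDs is not described here.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:8000"  # default address used throughout this guide


def download_url(dataset_id, base_url=BASE_URL):
    """Fill in the {dataset_id} placeholder of GET /data/{dataset_id}/download."""
    if not dataset_id:
        raise ValueError("dataset_id must be a non-empty string")
    # Percent-encode the ID so characters such as '/' or spaces cannot
    # change the shape of the path.
    encoded_id = quote(str(dataset_id), safe="")
    return f"{base_url.rstrip('/')}/data/{encoded_id}/download"
```

The resulting URL can be opened in a browser, passed to curl, or fetched with any HTTP client.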
Python Client Library
For programmatic access, use the pyds Python client library:
from pyds import Settings, Data, Featurize, Train
# Configure settings
settings = Settings()
settings.set_profile("my_profile")
settings.set_project("my_project")
# Initialize clients
data_client = Data(settings)
featurize_client = Featurize(settings)
train_client = Train(settings)
# Upload and process data
response = data_client.upload_data("dataset.csv")
dataset_address = response['dataset_address']
# Featurize
response = featurize_client.run(
    dataset_address=dataset_address,
    featurizer="ECFP",
    output="featurized_data",
    dataset_column="smiles"
)
# Train model
response = train_client.run(
    dataset_address=response['featurized_file_address'],
    model_type="random_forest_classifier",
    model_name="my_model"
)
For detailed Python client documentation, see PyDS library docs.
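The example above indexes straight into each response, so a failed step surfaces as a bare `KeyError` a line or two later. A small guard can make such failures easier to read. This is a sketch only: `require_key` is a hypothetical helper, not part of pyds, and it assumes responses are plain dictionaries, as the subscripting in the example suggests.

```python
def require_key(response, key):
    """Return response[key], or raise a descriptive error if it is missing."""
    if not isinstance(response, dict) or key not in response:
        raise KeyError(
            f"expected key {key!r} in server response, got: {response!r}"
        )
    return response[key]


# Usage with the upload step from the example above:
# dataset_address = require_key(response, "dataset_address")
```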
Troubleshooting
- Server Not Responding
Check if the server is running:
curl http://localhost:8000/healthcheck
- Need More Information
Visit http://localhost:8000/docs for comprehensive API documentation and interactive testing.
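For the Server Not Responding case, the curl check can also be scripted, for example before starting a longer workflow. The sketch below uses only the Python standard library. The `server_is_up` helper is hypothetical, and treating HTTP 200 as "healthy" is an assumption, since the response body of `/healthcheck` is not described in this guide.

```python
import urllib.error
import urllib.request


def server_is_up(base_url="http://localhost:8000", timeout=3.0):
    """Return True if GET /healthcheck answers with HTTP 200, else False."""
    url = base_url.rstrip("/") + "/healthcheck"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

If this returns False, confirm that the server process is running and that the host and port match your installation.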