1. Search for Papers
Use for: Finding papers by title, abstract, or topic
```python
# Simple search
results = client.search_works(
search="machine learning",
per_page=100
)
# Search with filters
results = client.search_works(
search="CRISPR gene editing",
filter_params={
"publication_year": ">2020",
"is_oa": "true"
},
sort="cited_by_count:desc"
)
```
2. Find Works by Author
Use for: Getting all publications by a specific researcher
Use the two-step pattern (entity name β ID β works):
```python
from scripts.query_helpers import find_author_works
works = find_author_works(
author_name="Jennifer Doudna",
client=client,
limit=100
)
```
Manual two-step approach:
```python
# Step 1: Get author ID
author_response = client._make_request(
'/authors',
params={'search': 'Jennifer Doudna', 'per-page': 1}
)
author_id = author_response['results'][0]['id'].split('/')[-1]
# Step 2: Get works
works = client.search_works(
filter_params={"authorships.author.id": author_id}
)
```
3. Find Works from Institution
Use for: Analyzing research output from universities or organizations
```python
from scripts.query_helpers import find_institution_works
works = find_institution_works(
institution_name="Stanford University",
client=client,
limit=200
)
```
4. Highly Cited Papers
Use for: Finding influential papers in a field
```python
from scripts.query_helpers import find_highly_cited_recent_papers
papers = find_highly_cited_recent_papers(
topic="quantum computing",
years=">2020",
client=client,
limit=100
)
```
5. Open Access Papers
Use for: Finding freely available research
```python
from scripts.query_helpers import get_open_access_papers
papers = get_open_access_papers(
search_term="climate change",
client=client,
oa_status="any", # or "gold", "green", "hybrid", "bronze"
limit=200
)
```
6. Publication Trends Analysis
Use for: Tracking research output over time
```python
from scripts.query_helpers import get_publication_trends
trends = get_publication_trends(
search_term="artificial intelligence",
filter_params={"is_oa": "true"},
client=client
)
# Sort and display
for trend in sorted(trends, key=lambda x: x['key'])[-10:]:
print(f"{trend['key']}: {trend['count']} publications")
```
7. Research Output Analysis
Use for: Comprehensive analysis of author or institution research
```python
from scripts.query_helpers import analyze_research_output
analysis = analyze_research_output(
entity_type='institution', # or 'author'
entity_name='MIT',
client=client,
years='>2020'
)
print(f"Total works: {analysis['total_works']}")
print(f"Open access: {analysis['open_access_percentage']}%")
print(f"Top topics: {analysis['top_topics'][:5]}")
```
8. Batch Lookups
Use for: Getting information for multiple DOIs, ORCIDs, or IDs efficiently
```python
dois = [
"https://doi.org/10.1038/s41586-021-03819-2",
"https://doi.org/10.1126/science.abc1234",
# ... up to 50 DOIs
]
works = client.batch_lookup(
entity_type='works',
ids=dois,
id_field='doi'
)
```
9. Random Sampling
Use for: Getting representative samples for analysis
```python
# Small sample
works = client.sample_works(
sample_size=100,
seed=42, # For reproducibility
filter_params={"publication_year": "2023"}
)
# Large sample (>10k) - automatically handles multiple requests
works = client.sample_works(
sample_size=25000,
seed=42,
filter_params={"is_oa": "true"}
)
```
10. Citation Analysis
Use for: Finding papers that cite a specific work
```python
# Get the work
work = client.get_entity('works', 'https://doi.org/10.1038/s41586-021-03819-2')
# Get citing papers using cited_by_api_url
import requests
citing_response = requests.get(
work['cited_by_api_url'],
params={'mailto': client.email, 'per-page': 200}
)
citing_works = citing_response.json()['results']
```
11. Topic and Subject Analysis
Use for: Understanding research focus areas
```python
# Get top topics for an institution
topics = client.group_by(
entity_type='works',
group_field='topics.id',
filter_params={
"authorships.institutions.id": "I136199984", # MIT
"publication_year": ">2020"
}
)
for topic in topics[:10]:
print(f"{topic['key_display_name']}: {topic['count']} works")
```
12. Large-Scale Data Extraction
Use for: Downloading large datasets for analysis
```python
# Paginate through all results
all_papers = client.paginate_all(
endpoint='/works',
params={
'search': 'synthetic biology',
'filter': 'publication_year:2020-2024'
},
max_results=10000
)
# Export to CSV
import csv
with open('papers.csv', 'w', newline='', encoding='utf-8') as f:
writer = csv.writer(f)
writer.writerow(['Title', 'Year', 'Citations', 'DOI', 'OA Status'])
for paper in all_papers:
writer.writerow([
paper.get('title', 'N/A'),
paper.get('publication_year', 'N/A'),
paper.get('cited_by_count', 0),
paper.get('doi', 'N/A'),
paper.get('open_access', {}).get('oa_status', 'closed')
])
```