Scrapebase + Permit.io: Web Scraping with API-First Authorization

This is a submission for the Permit.io Authorization Challenge: Permissions Redefined
What I Built
I built Scrapebase - a web scraping service with tiered access controls that demonstrates API-first authorization using Permit.io. The project separates business logic from authorization concerns using Permit.io's policy-as-code approach.
In many applications, authorization is implemented as an afterthought, resulting in security vulnerabilities and technical debt. Scrapebase demonstrates how to build with authorization as a first-class concern from day one.
Key Features
- Tiered Service Levels: Free, Pro, and Admin tiers with different capabilities
- API Key Authentication: Simple authentication using API keys
- Role-Based Access Control: Permissions managed through Permit.io
- Domain Blacklist System: Resource-level restrictions for sensitive domains
- Text Processing: Basic and advanced text processing with role-based restrictions
How It Works
The core authentication and authorization flow:
- User sends request with x-api-key header
- permitAuth middleware intercepts the request
- Middleware maps API key to user role (free_user, pro_user, or admin)
- User is synced to Permit.io
- Permission check runs against Permit.io cloud PDP
- Request is allowed or denied based on policy decision
┌──────────┐ ┌───────────────┐ ┌────────────┐ ┌──────────────┐
│ Client │───▶│ Scrapebase API│───▶│permitAuth │───▶│ Permit.io │
│ │◀───│ │◀───│ middleware │◀───│ Cloud PDP │
└──────────┘ └───────────────┘ └────────────┘ └──────────────┘
│ ▲
│ │
└────────────────────────────────────────────────────────┘
Permission policies defined in Permit.io dashboard
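The flow above can be sketched without any framework or SDK. This is a minimal illustration, not the project's actual middleware: `authorize`, `AuthDeps`, and the in-memory `demoDeps` are hypothetical names, and the stand-in `check` function only imitates a policy decision so the example runs on its own.

```typescript
// Hypothetical types for illustration; none of these names come from the Scrapebase repo.
type Role = "free_user" | "pro_user" | "admin";

interface AuthDeps {
  keyToRole: Map<string, Role>; // API key -> role lookup
  syncUser: (userKey: string, role: Role) => Promise<void>;
  check: (userKey: string, action: string, resource: string) => Promise<boolean>;
}

interface AuthResult {
  status: 200 | 401 | 403;
  role?: Role;
}

async function authorize(
  apiKey: string | undefined,
  advanced: boolean,
  deps: AuthDeps,
): Promise<AuthResult> {
  // Read the x-api-key value and map it to a role; unknown keys are rejected.
  const role = apiKey ? deps.keyToRole.get(apiKey) : undefined;
  if (!apiKey || !role) return { status: 401 };

  // Make sure the policy engine knows this user before asking about them.
  await deps.syncUser(apiKey, role);

  // Ask the policy engine for a decision on the requested action.
  const action = advanced ? "scrape_advanced" : "scrape_basic";
  const allowed = await deps.check(apiKey, action, "website");

  // Allow or deny based on that decision.
  return allowed ? { status: 200, role } : { status: 403, role };
}

// In-memory stand-in for the policy decision point, used only for this demo.
const rolesByUser = new Map<string, Role>();
const demoDeps: AuthDeps = {
  keyToRole: new Map<string, Role>([
    ["2025DEVChallenge_free", "free_user"],
    ["2025DEVChallenge_admin", "admin"],
  ]),
  syncUser: async (userKey: string, role: Role): Promise<void> => {
    rolesByUser.set(userKey, role);
  },
  check: async (userKey: string, action: string): Promise<boolean> => {
    const role = rolesByUser.get(userKey);
    if (!role) return false;
    // Free users get basic scraping only; other roles get both actions.
    return action === "scrape_basic" || role !== "free_user";
  },
};
```

Passing the dependencies in as an object keeps the decision logic testable: the same `authorize` function can be pointed at a real policy client or at a fake like `demoDeps`.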
Demo
You can test the API with the following requests:
# Test with free user
curl -X POST http://localhost:8080/api/processLinks \
-H "Content-Type: application/json" \
-H "x-api-key: 2025DEVChallenge_free" \
-d '{"url": "https://example.com"}'
# Test with admin user
curl -X POST http://localhost:8080/api/processLinks \
-H "Content-Type: application/json" \
-H "x-api-key: 2025DEVChallenge_admin" \
-d '{"url": "https://example.com", "advanced": true}'
Project Repo
0xtamizh / scrapebase-permit-IO
Scrapebase with Permit.io Authorization
A powerful web scraping API with fine-grained authorization controls powered by Permit.io. This project demonstrates how to implement sophisticated authorization patterns in a real-world API service.
Features
- Tiered Access Control: Different permissions for Free, Pro, and Admin users
- Resource-Based Authorization: Control access based on target domains
- Rate Limiting: Tier-specific rate limits enforced through policies
- Advanced Scraping Features: Premium capabilities restricted to Pro users
- Real-time Policy Updates: Changes to permissions take effect immediately
- Audit Logging: Track all authorization decisions
Quick Start
- Clone the repository:
git clone https://github.com/yourusername/scrapebase-permit
cd scrapebase-permit
- Install dependencies:
npm install
- Set up environment variables:
cp .env.example .env
Edit .env with your Permit.io API key and other configurations:
PERMIT_API_KEY=your_permit_api_key
ADMIN_API_KEY=2025DEVChallenge_admin
USER_API_KEY=2025DEVChallenge_user
- Start the development server:
npm run dev
- Visit http://localhost:3000 to access the testing UI
Testing the Authorization Features
Test Credentials
Admin User:
- Username: admin
- API Key: 2025DEVChallenge_admin
Regular…
My Journey
The Problem with Traditional Authorization
Traditional approaches to authorization often result in permission checks scattered throughout application code, creating maintenance nightmares and security risks. When I started this project, I wanted to demonstrate how modern applications can embrace externalized authorization as a core architectural principle.
I chose to build a web scraping service because it presents meaningful access control requirements:
- Tiered service levels that mirror real-world SaaS subscription models
- Administrative functions that require elevated permissions
- Resource-based restrictions through a domain blacklist system
The Power of API-First Authorization
The key insight that drove this project was the separation of concerns: business logic should be distinct from authorization decisions. By using Permit.io, I was able to:
- Define all permission policies in one place
- Enforce consistent access control across all endpoints
- Update policies without changing application code
The implementation was straightforward - here's the core middleware that powers the authorization flow:
// Map API key to user role
switch (apiKey) {
  case process.env.ADMIN_API_KEY:
    userKey = '2025DEVChallenge_admin';
    tier = 'admin';
    break;
  // ...other keys
}

// Sync user to Permit.io
await permit.api.syncUser({
  key: userKey,
  email: `${userKey}@scrapebase.xyz`,
  attributes: { tier, roles: [tier] }
});

// Check permission: requests with "advanced" set need the scrape_advanced action
const action = req.body.advanced ? 'scrape_advanced' : 'scrape_basic';
const permissionCheck = await permit.check(userKey, action, 'website');
if (!permissionCheck) {
  return res.status(403).json({
    success: false,
    error: 'Access denied by Permit.io'
  });
}
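One detail worth guarding against in a `switch` over `process.env` values: `switch` uses strict equality, so if a variable is unset and the request carries no key, `undefined` matches `undefined`. A lookup table that skips unset variables avoids that. The sketch below is an assumption-laden alternative, not the project's code: `buildKeyTable` and `resolveIdentity` are hypothetical helpers, and `FREE_API_KEY` and `PRO_API_KEY` are made-up variable names (only `ADMIN_API_KEY` appears in the snippet above).

```typescript
type Tier = "free_user" | "pro_user" | "admin";

interface Identity {
  userKey: string;
  tier: Tier;
}

// FREE_API_KEY and PRO_API_KEY are hypothetical names used for illustration.
function buildKeyTable(env: Record<string, string | undefined>): Map<string, Identity> {
  const entries: Array<[string | undefined, Tier]> = [
    [env["FREE_API_KEY"], "free_user"],
    [env["PRO_API_KEY"], "pro_user"],
    [env["ADMIN_API_KEY"], "admin"],
  ];
  const table = new Map<string, Identity>();
  for (const [key, tier] of entries) {
    // Skip unset or empty variables so a missing header can never match one.
    if (key) table.set(key, { userKey: key, tier });
  }
  return table;
}

function resolveIdentity(
  table: Map<string, Identity>,
  apiKey: string | undefined,
): Identity | undefined {
  // No key, or a key that is not in the table, resolves to no identity.
  return apiKey ? table.get(apiKey) : undefined;
}
```

A caller would build the table once at startup, for example `buildKeyTable(process.env)`, and respond with 401 whenever `resolveIdentity` returns `undefined`.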
Challenges Faced
Cloud PDP Limitations
Initially, I tried implementing Attribute-Based Access Control (ABAC) by passing resource attributes:
// This DIDN'T work with cloud PDP
const resource = {
  type: 'website',
  key: hostname,
  attributes: {
    is_blacklisted: isBlacklistedDomain
  }
};
const permissionCheck = await permit.check(user.key, action, resource);
The cloud PDP returned 501 (Not Implemented) errors because it supports only RBAC, not attribute-based checks. I had to simplify to a pure RBAC approach that passes only the resource type:
// This works with cloud PDP
const permissionCheck = await permit.check(user.key, action, resourceType);
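With resource attributes off the table, the blacklist decision has to be made somewhere else. One possible way to keep it inside an RBAC model is to resolve the blacklist locally and fold the result into the action name. This is a sketch under that assumption, not a description of how Scrapebase does it: `scrape_blacklisted` is a hypothetical action that would need to be defined in the dashboard and granted to admins only, and `isBlacklisted` and `pickAction` are illustrative helpers.

```typescript
// "scrape_blacklisted" is a hypothetical action; the post defines only the other two.
type Action = "scrape_basic" | "scrape_advanced" | "scrape_blacklisted";

function isBlacklisted(targetUrl: string, blacklist: readonly string[]): boolean {
  const hostname = new URL(targetUrl).hostname.toLowerCase();
  // Match the listed domain itself and any subdomain of it.
  return blacklist.some(
    (domain) => hostname === domain || hostname.endsWith("." + domain),
  );
}

function pickAction(
  targetUrl: string,
  advanced: boolean,
  blacklist: readonly string[],
): Action {
  // Blacklisted targets map to their own action, so the policy engine
  // can still make the final decision from the user's role alone.
  if (isBlacklisted(targetUrl, blacklist)) return "scrape_blacklisted";
  return advanced ? "scrape_advanced" : "scrape_basic";
}
```

The trade-off is that the blacklist data lives in the application while the grant stays in the policy, which is less expressive than ABAC but works with a role-only check.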
Role Assignment
Another challenge was ensuring roles were properly synchronized and recognized. The solution was two-fold:
- Properly sync users with their role information
- Manually configure role permissions in the Permit.io dashboard
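If the role has to be assigned explicitly rather than inferred from user attributes, the sync step can be made idempotent: create the user if needed, then assign the role only when it is missing. The sketch below shows that pattern against a small interface. `RoleClient` and its method names are placeholders for illustration and are not the Permit.io SDK's API; `createFakeClient` is an in-memory stand-in so the example runs on its own.

```typescript
type Role = "free_user" | "pro_user" | "admin";

// Minimal surface this sketch needs from an authorization client.
// These method names are placeholders, NOT the Permit.io SDK's API.
interface RoleClient {
  upsertUser(user: { key: string; email: string }): Promise<void>;
  listRoles(userKey: string): Promise<Role[]>;
  assignRole(userKey: string, role: Role): Promise<void>;
}

// Returns true when a new assignment was made, false when it already existed.
async function ensureUserWithRole(
  client: RoleClient,
  userKey: string,
  role: Role,
): Promise<boolean> {
  await client.upsertUser({ key: userKey, email: `${userKey}@scrapebase.xyz` });
  const current = await client.listRoles(userKey);
  if (current.indexOf(role) !== -1) return false; // already assigned
  await client.assignRole(userKey, role);
  return true;
}

// In-memory fake used only to exercise the function above.
function createFakeClient(): RoleClient & { roles: Map<string, Role[]> } {
  const roles = new Map<string, Role[]>();
  return {
    roles,
    async upsertUser(user: { key: string; email: string }): Promise<void> {
      if (!roles.has(user.key)) roles.set(user.key, []);
    },
    async listRoles(userKey: string): Promise<Role[]> {
      return roles.get(userKey) ?? [];
    },
    async assignRole(userKey: string, role: Role): Promise<void> {
      roles.set(userKey, [...(roles.get(userKey) ?? []), role]);
    },
  };
}
```

Calling `ensureUserWithRole` on every request is safe in this shape because repeated calls do not stack duplicate assignments.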
Using Permit.io for Authorization
Setting up Permit.io involved these key steps:
- Creating a project in the Permit.io dashboard
- Defining resources (website), actions (scrape_basic, scrape_advanced), and roles (free_user, pro_user, admin)
- Configuring the permission matrix in the dashboard
- Integrating the Permit.io SDK into my application
Here's the role-based capability matrix I implemented:
Feature | Free User | Pro User | Admin |
---|---|---|---|
Basic Scraping | ✅ | ✅ | ✅ |
Advanced Scraping | ❌ | ✅ | ✅ |
Text Cleaning | ✅ | ✅ | ✅ |
AI Summarization | ❌ | ✅ | ✅ |
View Blacklist | ✅ | ✅ | ✅ |
Manage Blacklist | ❌ | ❌ | ✅ |
Access Blacklisted Domains | ❌ | ❌ | ✅ |
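The same matrix can be written down as data, which is handy for unit tests or for a local fallback during development. This is only a mirror of the table above, not the policy itself (that lives in the Permit.io dashboard); the feature identifiers and the `isAllowed` helper are hypothetical names.

```typescript
type Role = "free_user" | "pro_user" | "admin";

// Hypothetical identifiers for the rows of the capability matrix.
type Feature =
  | "basic_scraping"
  | "advanced_scraping"
  | "text_cleaning"
  | "ai_summarization"
  | "view_blacklist"
  | "manage_blacklist"
  | "access_blacklisted_domains";

const ALL_ROLES: readonly Role[] = ["free_user", "pro_user", "admin"];
const PAID_ROLES: readonly Role[] = ["pro_user", "admin"];
const ADMIN_ONLY: readonly Role[] = ["admin"];

// Each feature lists the roles that are granted it, row by row.
const CAPABILITIES: Record<Feature, readonly Role[]> = {
  basic_scraping: ALL_ROLES,
  advanced_scraping: PAID_ROLES,
  text_cleaning: ALL_ROLES,
  ai_summarization: PAID_ROLES,
  view_blacklist: ALL_ROLES,
  manage_blacklist: ADMIN_ONLY,
  access_blacklisted_domains: ADMIN_ONLY,
};

function isAllowed(role: Role, feature: Feature): boolean {
  return CAPABILITIES[feature].indexOf(role) !== -1;
}
```

Typing the table as `Record<Feature, ...>` means the compiler flags any feature that is added to the union but left out of the matrix.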
Permission Enforcement
Permissions are enforced in two places:
- The permitAuth middleware for API endpoints:
const permissionCheck = await permit.check(user.key, action, 'website');
if (!permissionCheck) {
  return res.status(403).json({ success: false, error: 'Access denied' });
}
- Directly in route handlers for specific features:
// src/routes/summarize.ts
if (summarize) {
  const userTier = req.user?.attributes?.tier;
  if (userTier !== 'pro_user' && userTier !== 'admin') {
    return res.status(403).json({
      success: false,
      error: 'Access denied',
      details: 'Text summarization is only available for Pro and Admin users'
    });
  }
}
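If several routes need this kind of tier check, the comparison can be pulled into one reusable guard so the allowed tiers are declared where the route is registered. The following is a sketch only: `requireTier` is a hypothetical helper, and `Req`, `Res`, and `Next` are pared-down stand-ins for the Express types so the example has no dependencies.

```typescript
type Tier = "free_user" | "pro_user" | "admin";

// Pared-down stand-ins for the Express request/response types.
interface Req {
  user?: { attributes?: { tier?: Tier } };
}
interface Res {
  status(code: number): Res;
  json(body: unknown): Res;
}
type Next = () => void;

// Builds a middleware that lets the request through only for the listed tiers.
function requireTier(feature: string, ...allowed: Tier[]) {
  return (req: Req, res: Res, next: Next): void => {
    const tier = req.user?.attributes?.tier;
    if (tier === undefined || allowed.indexOf(tier) === -1) {
      res.status(403).json({
        success: false,
        error: "Access denied",
        details: `${feature} is not available for this tier`,
      });
      return;
    }
    next();
  };
}

// Example guard equivalent to the summarization check above.
const requireSummarizeTier = requireTier("Text summarization", "pro_user", "admin");
```

A guard like this still hard-codes tiers in application code; it only centralizes the check, so moving the rule into the policy remains the more flexible option.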
What I Learned
Building Scrapebase with Permit.io taught me how to:
- Separate authorization concerns from business logic
- Implement role-based access control with external policy management
- Design a flexible permission system that doesn't require code changes to update policies
The advantages of this approach are clear:
- Separation of concerns: Business logic remains focused on core functionality while authorization is handled externally
- Adaptable policies: Permissions can be updated without code changes or redeployments
- Consistent enforcement: Authorization decisions follow the same rules across all application endpoints
- Improved security: Centralized policy management reduces the risk of inconsistent permission checks
- Developer experience: Cleaner codebase with reduced authorization-related complexity
This externalized approach enables business stakeholders to manage authorization policies directly through the Permit.io dashboard, while developers focus on building features - the hallmark of a well-designed API-first authorization system.
Future Improvements
With more time, I would:
- Set up a local PDP to enable ABAC with resource attributes
- Implement tenant isolation for multi-tenant support
- Add UI components in the admin dashboard to view permission audit logs
- Create more granular roles and permissions beyond the three tiers
- Add a user management section to assign roles through the UI
Scrapebase demonstrates how modern SaaS apps can delegate complex authorization to a specialized service like Permit.io, allowing developers to focus on core features while maintaining robust access controls.