SqlStorageClient
Hierarchy
- StorageClient
- SqlStorageClient
Methods
__aenter__
Async context manager entry.
Returns SqlStorageClient
__aexit__
Async context manager exit.
Parameters
exc_type: type[BaseException] | None
exc_value: BaseException | None
exc_traceback: TracebackType | None
Returns None
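A minimal usage sketch of the async context manager. The import path `crawlee.storage_clients` and the need for an installed `aiosqlite` driver are assumptions not stated on this page. The connection string is the example value from the `__init__` description.

```python
async def main() -> None:
    # Assumption: SqlStorageClient is importable from crawlee.storage_clients.
    # The import is kept inside the function so the module loads without Crawlee.
    from crawlee.storage_clients import SqlStorageClient

    # __aenter__ returns the client itself; __aexit__ runs when the block ends.
    async with SqlStorageClient(
        connection_string="sqlite+aiosqlite:///crawlee.db",
    ) as storage_client:
        # get_dialect_name() is documented to return str | None.
        print(storage_client.get_dialect_name())


# To run the sketch: import asyncio; asyncio.run(main())
```

Using `async with` rather than calling `close` by hand keeps cleanup tied to the block, presumably even when an exception is raised inside it.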
__init__
Initialize the SQL storage client.
Parameters
optional keyword-only connection_string: str | None = None
Database connection string (e.g., "sqlite+aiosqlite:///crawlee.db"). If not provided, defaults to a SQLite database in the storage directory.
optional keyword-only engine: AsyncEngine | None = None
Pre-configured AsyncEngine instance. If provided, connection_string is ignored.
Returns None
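The precedence between the two parameters can be sketched as a small pure function. This is an illustration of the documented rule, not the library's code: `describe_target`, `default_sqlite_url`, and the `"storage"` directory name are hypothetical.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def default_sqlite_url(storage_dir: str) -> str:
    # Hypothetical helper: builds a relative SQLite URL for crawlee.db.
    return f"sqlite+aiosqlite:///{PurePosixPath(storage_dir) / 'crawlee.db'}"


def describe_target(
    *,
    connection_string: str | None = None,
    engine: object | None = None,
    storage_dir: str = "storage",  # assumed directory name, for illustration only
) -> str:
    # 1. A pre-configured engine wins; connection_string is ignored.
    if engine is not None:
        return "engine"
    # 2. Otherwise an explicit connection string is used.
    if connection_string is not None:
        return connection_string
    # 3. With neither, fall back to SQLite in the storage directory.
    return default_sqlite_url(storage_dir)


default_target = describe_target()
both_given = describe_target(connection_string="sqlite+aiosqlite:///other.db", engine=object())
```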
close
Close the database connection pool.
Returns None
create_dataset_client
Create a dataset client.
Parameters
optional keyword-only id: str | None = None
optional keyword-only name: str | None = None
optional keyword-only alias: str | None = None
optional keyword-only configuration: Configuration | None = None
Returns DatasetClient
create_kvs_client
Create a key-value store client.
Parameters
optional keyword-only id: str | None = None
optional keyword-only name: str | None = None
optional keyword-only alias: str | None = None
optional keyword-only configuration: Configuration | None = None
Returns KeyValueStoreClient
create_rq_client
Create a request queue client.
Parameters
optional keyword-only id: str | None = None
optional keyword-only name: str | None = None
optional keyword-only alias: str | None = None
optional keyword-only configuration: Configuration | None = None
Returns RequestQueueClient
create_session
Create a new database session.
Returns AsyncSession
A new AsyncSession instance.
get_accessed_modified_update_interval
Get the interval for accessed and modified updates.
Returns timedelta
get_dialect_name
Get the database dialect name.
Returns str | None
get_rate_limit_errors
Return statistics about rate limit errors encountered by the HTTP client in the storage client.
Returns dict[int, int]
get_storage_client_cache_key
Return a cache key that distinguishes the storages of this client from those of other clients.
The key can be based on the configuration or on the client itself. By default, it is the module and name of the client class.
Parameters
configuration: Configuration
Returns Hashable
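A minimal sketch of the default behavior described above, assuming the key is simply the class's module and name joined into a string. Both classes are hypothetical stand-ins, and the per-database override only illustrates what "based on the client itself" might look like.

```python
from collections.abc import Hashable


class StorageClientSketch:
    """Hypothetical stand-in for a storage client base class."""

    def get_storage_client_cache_key(self, configuration: object) -> Hashable:
        # Default: module and name of the client class.
        cls = type(self)
        return f"{cls.__module__}.{cls.__name__}"


class PerDatabaseClientSketch(StorageClientSketch):
    """Hypothetical subclass whose key also depends on the client itself."""

    def __init__(self, connection_string: str) -> None:
        self.connection_string = connection_string

    def get_storage_client_cache_key(self, configuration: object) -> Hashable:
        # Tuples of hashables are hashable, so this works as a cache key.
        base_key = super().get_storage_client_cache_key(configuration)
        return (base_key, self.connection_string)


default_key = StorageClientSketch().get_storage_client_cache_key(None)
key_a = PerDatabaseClientSketch("sqlite+aiosqlite:///a.db").get_storage_client_cache_key(None)
key_b = PerDatabaseClientSketch("sqlite+aiosqlite:///b.db").get_storage_client_cache_key(None)
```

With a key like this, two clients pointing at different databases would not share cached storages, while two instances of the plain base class would.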
initialize
Initialize the database schema.
This method creates all necessary tables if they don't exist. It should be called before the storage client is used.
Parameters
configuration: Configuration
Returns None
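The "create tables if they don't exist" behavior makes initialization safe to repeat. The sketch below shows that pattern with the standard-library `sqlite3` module. The table and column names are hypothetical and are not the schema the client actually creates.

```python
import sqlite3

# Hypothetical schema: one metadata table and one records table for datasets.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS dataset_metadata (id TEXT PRIMARY KEY, name TEXT)",
    "CREATE TABLE IF NOT EXISTS dataset_records ("
    "id INTEGER PRIMARY KEY, dataset_id TEXT, data TEXT)",
)


def initialize_schema(conn: sqlite3.Connection) -> None:
    # IF NOT EXISTS turns each statement into a no-op when the table is present.
    for statement in SCHEMA:
        conn.execute(statement)
    conn.commit()


conn = sqlite3.connect(":memory:")
initialize_schema(conn)
initialize_schema(conn)  # Second call changes nothing and raises no error.

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
conn.close()
```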
Properties
engine
Get the SQLAlchemy AsyncEngine instance.
SqlStorageClient is the SQL implementation of the storage client.
This storage client provides access to datasets, key-value stores, and request queues that persist data to a SQL database using SQLAlchemy 2+. Each storage type uses two tables: one for metadata and one for records.
The client accepts either a database connection string or a pre-configured AsyncEngine. If neither is provided, it creates a default SQLite database named crawlee.db in the storage directory.
The database schema is created automatically during initialization. SQLite databases receive performance optimizations, including WAL mode and an increased cache size.
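For readers unfamiliar with these SQLite settings, the sketch below applies them with the standard-library `sqlite3` module. It only illustrates the two pragmas. The cache size value is an arbitrary example, and the client may apply different values or additional settings.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_tuned_sqlite(db_path: Path) -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    # Write-ahead logging lets readers continue while a writer is active.
    conn.execute("PRAGMA journal_mode=WAL")
    # A negative cache_size is a size in KiB; 64000 KiB is an example value.
    conn.execute("PRAGMA cache_size=-64000")
    return conn


# WAL needs a file-backed database, so use a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    conn = open_tuned_sqlite(Path(tmp_dir) / "crawlee.db")
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
    conn.close()
```

The journal mode is stored in the database file, while `cache_size` applies per connection and has to be set again each time a connection is opened.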
This is an experimental feature. The behavior and interface may change in future versions.