Driver for Oracle Database v0.6.0-alpha.1¶
Driver Version v0.6.0-alpha.1 Tested With Oracle Database 23ai Tested With Oracle Database 26ai
Warning
This is a prerelease version.
This driver provides access to Oracle Database (commonly referred to as just “Oracle”).
Note
This project is not affiliated with Oracle.
Installation & Quickstart¶
Important
This driver is available from Columnar’s private driver registry. Before installing it, create an account at cloud.columnar.tech and click to activate a 14-day free trial. Then authenticate to the registry:
dbc auth login
The driver can be installed with dbc :
dbc install oracle --pre
To use the driver:
Install the Oracle Instant Client libraries.
Set
LD_LIBRARY_PATH,DYLD_LIBRARY_PATH, or otherwise make sure the Oracle Instant Client libraries are discoverable by your application.Provide the connection string as the
uriparameter.
Note
It is highly recommended to explicitly set timezone="UTC" to avoid ambiguity
in local timestamp values (if used).
Note
It is highly recommended to use the latest Oracle Instant Client release.
Connecting¶
To use the driver, provide an Oracle connection string as the uri option. The driver supports URI format and DSN-style connection strings, but URIs are recommended.
from adbc_driver_manager import dbapi
dbapi.connect(
driver="oracle",
db_kwargs={
"uri": "oracle://user:password@localhost:1521/FREEPDB1"
}
)
Note: The example above is for Python using the adbc-driver-manager package but the process will be similar for other driver managers. See adbc-quickstarts .
Connection String Format¶
Oracle URI syntax:
oracle://[user[:password]@]host[:port][/serviceName][?param1=value1¶m2=value2]
This follows the Go Oracle Driver Connection String format.
Components:
scheme:oracle://(required)user/password: (optional) For standard database authentication.host: (required) The hostname or IP address of the Oracle database server.port: (optional) Defaults to 1521.serviceName: (optional) The Service Name or SID, used by the Oracle Client to identify the target database instance. Also can use full Oracle Easy Connect Plus identifier format, such as/serviceName:serverType/instanceName. See Oracle Database Easy Connect PlusQuery Parameters: Additional configuration options. For a complete list of parameters, see the Go Oracle Driver Connection Parameters
Note
Reserved characters in URI elements must be URI-encoded. For example, @ becomes %40.
Examples:
oracle://scott:tiger@salesserver1/sales.us.example.com?poolSessionTimeout=42soracle://dbhost.local/PROD_PDBoracle://user@10.0.0.5:1521/svc?configDir=/tmp/adminoracle://user:p%40ssword@host/service
This driver also supports other types of connection strings that are supported by the Go Oracle Driver .
Feature & Type Support¶
| Feature | Oracle Database | |
|---|---|---|
| Bulk Ingestion | Create | ✅ |
| Append | ✅ | |
| Create/Append | ✅ | |
| Replace | ✅ | |
| Temporary Table | ❌ | |
| Target Catalog | ❌ | |
| Target Schema | ✅ | |
| Non-nullable fields are marked NOT NULL | ✅ | |
| Catalog (GetObjects) | depth=catalogs | ✅ |
| depth=db_schemas | ✅ | |
| depth=tables | ✅ | |
| depth=columns (all) | ✅ | |
| Get Parameter Schema | ❌ | |
| Get Table Schema | ✅ | |
| Prepared Statements | ✅ | |
| Transactions | ✅ | |
Types¶
Database to Arrow¶
| Database Type | Oracle Database |
|---|---|
|
BINARY_DOUBLE |
double |
|
BINARY_FLOAT |
float |
|
BLOB |
large_binary [1] |
|
BOOLEAN |
bool [2] |
|
CHAR |
string [3] |
|
CLOB |
large_string [3] |
|
DATE |
date32[day] |
|
DOUBLE PRECISION |
double |
|
INT |
decimal128(38, 0) |
|
INTERVAL DAY TO SECOND (redo.query.day_time_interval_type = duration) |
duration[s] [4] |
|
INTERVAL DAY TO SECOND (redo.query.day_time_interval_type = monthdaynano) |
|
|
INTERVAL DAY TO SECOND (redo.query.day_time_interval_type = string) |
|
|
INTERVAL YEAR TO MONTH (redo.query.year_month_interval_type = monthdaynano) |
month_day_nano_interval [9] |
|
INTERVAL YEAR TO MONTH (redo.query.year_month_interval_type = string) |
string [10] |
|
JSON |
extension<arrow.json> [11] |
|
LONG |
large_string [3] |
|
LONG RAW |
large_binary |
|
NCLOB |
large_string [3] |
|
NUMBER (redo.query.unbounded_number_type = float64) |
double |
|
NUMBER (redo.query.unbounded_number_type = int64) |
int64 |
|
NUMBER (redo.query.unbounded_number_type = string) |
string |
|
NUMBER(p, 0) (p <= 18) |
int64 |
|
NUMBER(p, s) (s != 0, p > 18, redo.use_high_precision = false) |
double [12] |
|
NUMBER(p, s) (s != 0, p > 18, redo.use_high_precision = true) |
decimal128 [12] |
|
NUMBER(p, s), NUMBER(p), NUMBER(*, s) |
decimal128 |
|
OBJECT |
extension<arrow.json> [11] |
|
RAW |
binary |
|
REAL |
double |
|
ROWID, UROWID |
string |
|
SDO_GEOMETRY |
extension<geoarrow.wkb> |
|
SMALLINT |
decimal128(38, 0) |
|
TIMESTAMP |
timestamp[ns], timestamp[us] ⚠️ [13] |
|
TIMESTAMP WITH LOCAL TIME ZONE |
timestamp[us] (with time zone) [14] |
|
TIMESTAMP WITH TIME ZONE |
timestamp[ns] (with time zone), timestamp[us] (with time zone) ⚠️ [13] |
|
VARCHAR2 |
string [3] |
|
VARRAY |
extension<arrow.json> [11] |
|
XMLType |
extension<arrow.opaque[storage_type=string, type_name=XMLType, vendor_name=Oracle Database]> [11] [3] |
Arrow to Database¶
| Arrow Type | Oracle Database Type | |
|---|---|---|
| Bind | Ingest | |
|
binary |
RAW |
BLOB |
|
binary_view |
RAW |
BLOB |
|
bool |
BOOLEAN |
|
|
date32[day] |
DATE |
|
|
decimal128 |
NUMBER(p, s) |
|
|
double |
BINARY_DOUBLE |
|
|
extension<arrow.json> |
JSON |
(NA/not tested) |
|
extension<geoarrow.wkb> |
(NA/not tested) |
SDO_GEOMETRY [15] |
|
extension<geoarrow.wkt> |
(NA/not tested) |
SDO_GEOMETRY [15] |
|
fixed_size_binary |
RAW |
BLOB |
|
float |
BINARY_FLOAT |
|
|
halffloat |
BINARY_FLOAT |
(NA/not tested) |
|
int16 |
NUMBER(38, 0) |
NUMBER(5, 0) |
|
int32 |
NUMBER(38, 0) |
NUMBER(10, 0) |
|
int64 |
(NA/not tested) |
NUMBER(19, 0) |
|
large_binary |
RAW |
BLOB |
|
large_string |
NVARCHAR2 |
NVARCHAR2(2000) [3] |
|
list |
(NA/not tested) |
NVARCHAR2, JSON, BLOB |
|
month_day_nano_interval |
(NA/not tested) |
INTERVAL DAY(9) TO SECOND(9), INTERVAL YEAR(9) TO MONTH |
|
null |
(NA/not tested) |
NUMBER |
|
string |
NVARCHAR2 |
NVARCHAR2(2000) [3] |
|
string_view |
RAW |
NVARCHAR2(2000) [3] |
|
struct |
(NA/not tested) |
NVARCHAR2, BLOB, JSON |
|
timestamp[ms] |
TIMESTAMP(3) |
|
|
timestamp[ms] (with time zone) |
TIMESTAMP(3) WITH TIME ZONE |
|
|
timestamp[ns] |
TIMESTAMP(9) |
|
|
timestamp[ns] (with time zone) |
TIMESTAMP(9) WITH TIME ZONE |
|
|
timestamp[s] |
TIMESTAMP(0) |
|
|
timestamp[s] (with time zone) |
TIMESTAMP(0) WITH TIME ZONE |
|
|
timestamp[us] |
TIMESTAMP(6) |
|
|
timestamp[us] (with time zone) |
TIMESTAMP(6) WITH TIME ZONE |
|
Catalog Metadata/GetObjects behavior¶
Oracle Database does not quite support the standard SQL catalog.schema.table
three-part namespace. Instead, a schema.table@namespace syntax is
supported, with the namespace being a multitenant container database (CDB),
pluggable database (PDB), or DBLink. Currently, the ADBC driver will surface
the currently attached PDB as the “catalog” and list schemas as children of
that catalog. Future improvements may surface DBLinks as catalog as well.
See also
Oracle AI Database 26ai Technical Architecture describing CDBs and PDBs
Implicit Result Sets¶
Currently, queries that use DBMS_SQL.RETURN_RESULT are supported, but only
if execution_mode is batch (see below), and only with a single cursor. The
following query:
DECLARE
c SYS_REFCURSOR;
BEGIN
OPEN c FOR SELECT 1 AS RESULT FROM DUAL;
DBMS_SQL.RETURN_RESULT(c);
END;
Will have a result set of 1.
The following query has the same result set, despite the extra cursor:
DECLARE
c SYS_REFCURSOR;
d SYS_REFCURSOR;
BEGIN
OPEN c FOR SELECT 1 AS RESULT FROM DUAL;
OPEN d FOR SELECT 'HELLO WORLD' AS RESULT FROM DUAL;
DBMS_SQL.RETURN_RESULT(c);
DBMS_SQL.RETURN_RESULT(d);
END;
This is due to a limitation in the current ADBC APIs which we are working to resolve (the proposal for multiple result sets is accepted, but not implemented or released).
OUT and IN/OUT Parameters¶
Oracle stored procedures can have OUT and IN/OUT parameters. Since ADBC uses the Arrow C Data Interface (which provides immutable data), such parameter values cannot be returned through bind parameters. Instead, they are returned as extra columns appended to the result schema.
To use OUT parameters, bind a field with the metadata key
ORACLE:parameter_type set to out (or inout):
cursor.execute(
"""
CREATE OR REPLACE PROCEDURE greet(name IN VARCHAR2, greeting OUT VARCHAR2) AS
BEGIN
greeting := 'Hello, ' || name;
END;
"""
)
params = pyarrow.record_batch(
{"name": ["World"], "greeting": [None]},
schema=pyarrow.schema([
pyarrow.field("name", pyarrow.string()),
pyarrow.field("greeting", pyarrow.string(), metadata={
"ORACLE:parameter_type": "out",
}),
]),
)
cursor.execute("BEGIN greet(:1, :2); END;", params)
print(cursor.fetchall()) # [('Hello, World',)]
The first row of the result will contain OUT parameter values; subsequent rows (if any, from a result set) contain NULL for the OUT columns.
Note
ExecuteUpdate does not support OUT/INOUT parameters. Use ExecuteQuery
(or the DBAPI execute method) instead.
Timezones¶
The Oracle Instant Client depends on the client and server having the exact same “Time Zone File” to resolve timezone names. If these are not matched, you may get errors like the following for queries that reference timezones:
Could not read row: ORA-01805: possible error in date/time operation
You can avoid this by using zone offsets (like -05:00) instead of timezone
names. You can also configure the ORA_TZFILE environment variable so that
this file is the same between client and server. Unfortunately, this is a
global setting and cannot be set per connection[16]. To learn more,
including how to configure this environment variable, read the Datetime Data
Types and Time Zone Support section of the “Database
Globalization Support Guide” in the Oracle Database documentation,
particularly section 4.6, “Choosing a Time Zone File”.
Memory and Performance Tuning¶
When reading data, we recommend the following for best performance:
Set
redo.statement.execution_modetobatch. (This will be the default in a future version.)Set
redo.statement.fetch_array_size. If there is little or no latency between your client and the database, then a low value (in the tens to hundreds of rows) is best. If there is high latency, then a higher value (in the thousands of rows) is best. We recommend benchmarking your specific setup and queries to learn the best value for this parameter.
Memory usage, however, is directly proportional to fetch_array_size. If
memory usage is a concern, then a lower value is recommended. (The Oracle SDK
pre-allocates memory to hold fetch_array_size rows, hence increasing fetch
size also increases allocation size.)
See also
Options¶
Options Affecting Queries¶
redo.query.day_time_interval_typeValues:
auto,duration,monthdaynano, orstring. Default:auto.Controls how to handle INTERVAL DAY TO SECOND types in Oracle. The behaviors are as follows:
Mode
Arrow type (SECOND part is millisecond or coarser)
Arrow type (SECOND part is sub-millisecond)
autoday-second interval
month-day-nano interval
duration [17]
duration (second or millisecond)
duration (microsecond or nanosecond)
monthdaynano
month-day-nano interval
month-day-nano interval
string
string (ISO 8601 duration)
string (ISO 8601 duration)
monthdaynanois provided to help PyArrow users, as PyArrow does not support other interval types . Similarly,durationandstringare provided as fallback options for systems that cannot handle Arrow interval types.redo.query.year_month_interval_typeValues:
auto,monthdaynano, orstring. Default:auto.Controls how to handle INTERVAL YEAR TO MONTH types in Oracle. The behaviors are as follows:
Mode
Arrow Type
auto
year-month interval
monthdaynano
month-day-nano interval
string
string (ISO 8601 duration)
monthdaynanois provided to help PyArrow users, as PyArrow does not support other interval types .stringis provided as a general fallback for systems that cannot handle Arrow interval types.Note that in
autoormonthdaynanomodes, INTERVAL YEAR(9) TO MONTH may overflow for extreme values, as all Arrow interval types store the year/month part of the interval as a 32-bit signed integer number of months.stringwill not overflow.redo.query.unbounded_number_typeValues:
float64,string, orint64. Default:float64.Controls how to handle NUMBER with no specified precision and scale. This occurs for
NUMBER,REAL, andDOUBLE PRECISIONtypes. This also occurs with certain aggregation operators likeCOUNTandSUM:>>> cur.execute("SELECT SUM(1.01) FROM DUAL").fetchallarrow().schema SUM(1.01): double -- field metadata -- oracle.type-name: 'NUMBER(0,0)' >>> cur.execute("SELECT COUNT(*) FROM DUAL").fetchallarrow().schema COUNT(*): double -- field metadata -- oracle.type-name: 'NUMBER(0,0)'
Unfortunately, this means that the driver cannot differentiate between integer-valued NUMBER and floating-point NUMBER in this scenario.
redo.use_extension_typeType: boolean. Default: false
Annotate result set column types with Arrow extension types . For example, when enabled, JSON columns will be returned as Arrow string columns with the JSON canonical extension type metadata . When not enabled (the default), JSON columns will simply be Arrow string columns.
This is disabled by default because many Arrow implementations will completely change the representation of such columns/types when extension type metadata is seen, requiring special handling for those types.
Example:
>>> cur.execute("""SELECT CAST('{"foo": "bar"}' AS JSON) AS RESULT""").fetchallarrow().schema RESULT: string -- field metadata -- oracle.type-name: 'JSON' >>> cur.adbc_statement.set_options(**{"redo.use_extension_type": "true"}) >>> cur.execute("""SELECT CAST('{"foo": "bar"}' AS JSON) AS RESULT""").fetchallarrow().schema RESULT: extension<arrow.json> -- field metadata -- oracle.type-name: 'JSON'
redo.use_high_precisionType: boolean. Default: true
Controls how to handle NUMBER types in Oracle. When enabled (the default), bounded NUMBER types will be returned as Arrow decimal128 (precision <= 38) or decimal256 (precision > 38). When disabled, or when the NUMBER type has an unknown or unbounded precision, they will be returned as Arrow float64.
Note
NUMBER types may not have a precision specified, in which case they will always be returned as Arrow float64, regardless of this setting. Additionally, NUMBER types where the scale is 0 and the precision is <= 18 will always be returned as Arrow int64, again regardless of this setting.
Example:
>>> cur.execute("""SELECT CAST('1.0' AS NUMBER(10, 2)) AS RESULT""").fetchallarrow().schema RESULT: decimal128(10, 2) -- field metadata -- oracle.type-name: 'NUMBER(10,2)' >>> cur.adbc_statement.set_options(**{"redo.use_high_precision": "false"}) >>> cur.execute("""SELECT CAST('1.0' AS NUMBER(10, 2)) AS RESULT""").fetchallarrow().schema RESULT: double -- field metadata -- oracle.type-name: 'NUMBER(10,2)'
redo.statement.execution_modeValues:
roworbatch. Default:row.Controls how to parse the query result internally.
row(the current default) parses values row-by-row.batchparses batches of values.We recommend
batchand expect it to always be on-par or faster. It will become the default in version 0.7.0 of the driver, androwis planned to be removed in version 0.8.0. Please report any bugs you may encounter.Warning
Some types may overflow their internal representations in
rowmode, and the driver cannot detect this. For this reason and others, it is highly recommended to usebatch.redo.statement.prefetch_bytesandredo.statement.prefetch_rowsType: positive integer. Default: unspecified (
prefetch_bytes); 1 (prefetch_rows).How many rows or bytes of data to fetch on initial query execution. If you know approximately how much data a query will return, setting this can improve performance by avoiding subsequent round-trips to the database server. For
prefetch_bytes, the SDK will fetch as many complete rows as can fit in the specified memory.If both are set, then whichever limit is reached first by the SDK will take effect.
redo.statement.fetch_array_sizeType: positive integer. Default: unspecified.
How many rows of data to fetch on each server round-trip. If the connection is high-latency, then increasing this is highly recommended to reduce the number of round-trips.
redo.statement.read_batch_approx_bytesType: nonnegative integer. Default: 0 (no limit)
An approximate limit on the memory size of each returned Arrow batch. This limit is best-effort and individual batches may exceed (or undershoot) the limit.
redo.statement.read_batch_rowsType: positive integer. Default: 65536
A hard limit on the number of rows in each returned Arrow batch.
Options Affecting Bulk Ingest¶
redo.statement.insert_batch_rowsType: positive integer. Default: 65536
How many rows to insert on each call to the database.
redo.ingest.struct_typeValues:
blob,json,nvarchar2,object. Default:objectWhat database type to ingest Arrow struct, list, fixed size list, and list view types as.
blob: create BLOB CHECK IS JSON columns containing JSONnvarchar2: create NVARCHAR2 CHECK IS JSON columns containing JSONjson: create JSON columns (only on Oracle Database 21c and newer)object: create OBJECT columns (not currently implemented)
redo.ingest.tablespaceType: string. Default: not set
Create the table to ingest into in the specified tablespace .
FAQ/Troubleshooting¶
Canceled queries still show up in v$session as INACTIVE¶
This appears to be a quirk of Oracle. While completed queries will result in v$session.sql_id returning to NULL, cancelled queries will remain in sql_id until another query is executed. You can observe this by starting a query, and instead of pulling data, using ALTER SYSTEM CANCEL SQL to cancel the query on the backend. The query will remain in v$session, but trying to fetch data will result in ORA-01013: User requested cancel of current operation, indicating the query was indeed cancelled.
Compatibility¶
This driver was tested on:
Oracle Database
23.5.0.24.7 [Oracle Database 23ai Free Release 23.0.0.0.0 - Develop, Learn, and Run for Free; Version 23.5.0.24.07] - Instant Client 23.26.0.0.0 - godror v0.0.0-20260521001602-fcb4d984bcab+ODPI-5.6.4Oracle Database
23.26.1.0.0 [Oracle AI Database 26ai Free Release 23.26.1.0.0 - Develop, Learn, and Run for Free; Version 23.26.1.0.0] - Instant Client 23.26.0.0.0 - godror v0.0.0-20260521001602-fcb4d984bcab+ODPI-5.6.4
Previous Versions¶
To see documentation for previous versions of this driver, see the following: