-
Notifications
You must be signed in to change notification settings - Fork 166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Most BCP improvements from NCBI, as of May 2024 #558
base: master
Are you sure you want to change the base?
Conversation
e6facc4
to
7b71bdb
Compare
I've force-pushed a corrected version superseding #570. |
Meanwhile, I'm looking to extend this series with BCP-specific tuneups superseding #573. To that end, I wanted to compare behavior for three cases: normal BCP in, BCP into an alternate version of the table where column types are all On the DBLIB front, I found that all three cases fully succeeded when built against this branch, but
Any idea what I might be missing? I took care to predefine UPDATE: Those errors turned out to stem from the use of |
Here are my summarized comparison results (comparing against version 16.0 of Sybase's client libraries), where CTLIB BCP truncation differences w/CS layer messages ignored (see below for corrected version):
DBLIB BCP truncation differences (also corrected below):
|
I'll turn back to other PRs for now, but do still plan to add a proper replacement for #573 at some point. |
Ah, |
CTLIB BCP truncation differences w/CS layer messages printed, and more corner cases examinedGeneral notes
Notably, all errors under Sybase officially come from the CT/BLK layer, whereas FreeTDS only ever issues CS-layer errors that offer less context and may be likelier to slip through the cracks. That should be possible to remedy by letting |
Corrected DBLIB truncation differences (modulo formatting details along the above lines and FreeTDS's support for additional client types):
|
7b71bdb
to
468812f
Compare
When the user erroneously supplies NULL data for a column which does not use a nullable type and is not otherwise known to be nullable, cleanly fail. Pressing on anyway (in hopes that a server-side default would take effect?) was liable to trigger legitimate assertion failures in tds_checks.c. To that end, take the null_error callback and invoke it before bailing unless it is itself NULL. Signed-off-by: Aaron M. Ucko <[email protected]>
* Read defaults, send them back explicitly as needed, and clean up their storage afterwards. * tds5_bcp_add_variable_columns: ** Send spaces in lieu of fully empty SYB(VAR)CHAR data. ** Send correct textpos values in all cases. ** When writing the adjustment table, avoid a potential Sybase "row format error" by eliding the (effective) column count when it would have repeated the largest adjustment table entry. * tds_deinit_bcpinfo: Free and reset current_row for transfers in. Avoid potentially trying to interpret fragments of outgoing TDS data as BLOB pointers. Signed-off-by: Aaron M. Ucko <[email protected]>
Under these protocol versions, MS SQL Server describes these columns as having NVARCHAR(n) types rather than traditional DATETIME types, due to DATETIME's range limitations. (Likewise for TIME, which however still needs to come in as a preconverted UTF-16 string to compensate for combined API and protocol limitations.) - ctlib/blk.c: Perform character conversion on strings converted from fixed-width types such as CS_DATETIME_TYPE when a comparison of column-size limits appears to indicate an encoding difference. - tds/config.c: Adjust STD_DATETIME_FMT to match MS's (ISO-ish) tastes; in particular, ensure that the entire date portion fits within the first 10 characters. Signed-off-by: Aaron M. Ucko <[email protected]>
* Extend _cs_convert to take an additional handle argument (allowed to be NULL) to provide for the possibility of replacing the caller's original output buffer with the one tdsconvert allocated itself (as it always does for blobs), rather than copying the blob data. * blk.c (_blk_get_col_data): Provide a non-NULL handle, both for efficiency's sake and because tds_alloc_bcp_column_data imposes a 4 KiB cap with the expectation that callers will take this approach. (dblib already does.) Signed-off-by: Aaron M. Ucko <[email protected]>
To that end, factor a static _blk_clean_desc function out of blk_done's CS_BLK_ALL case. Signed-off-by: Aaron M. Ucko <[email protected]>
* tds.h (struct tds_bcpinfo): Add more fields to enable the necessary bookkeeping. * ctlib/blk.c: Implement blk_textfer; adjust other functions accordingly. * tds/{bulk,data}.c: Allow for deferred, and possibly piecemeal, blob submission; in particular, make tds_bcp_send_record resumable. Signed-off-by: Aaron M. Ucko <[email protected]>
* Allow _cs_convert to accept an explicit TDS destination type; direct it to copy cres.dta if that destination type is SYBMS{DATE,TIME}*. * _blk_get_col_data: Don't give _cs_convert_not_client any data to convert (blindly in the wrong direction!); rather, if it indicates that the column holds a new MS type, supply an explicit TDS destination type to _cs_convert. * _cs_dt_crack_v2: Formally populate datetzone from dr.timezone (still always 0 in practice). Signed-off-by: Aaron M. Ucko <[email protected]>
Signed-off-by: Aaron M. Ucko <[email protected]>
Adjust existing failure modes' boundaries and introduce the possibility of proceeding with truncated input with or without a message (with the latter occurring in some cases that require no conversion). As for messages, arrange to emit them via _ctclient_msg rather than _csclient_msg with the same (English) wording and context as Sybase yields, modulo formal punctuation differences and the usual discrepancies in layer and origin numbering. To that end: * Add an internal BLK_CONV_STATUS enum type whose values double as (new) message numbers. * Cover those numbers in _ct_get_user_api_layer_error. * Have _cs_convert additionally take a (quite possibly NULL) pointer to a bulk-conversion-status value to populate as appropriate. * Keep a running count of rows sent for reporting purposes (determining the column number to cite by a linear scan of the bindinfo's columns array). Signed-off-by: Aaron M. Ucko <[email protected]>
The destination view may well have an INSTEAD OF INSERT trigger that makes meaningful use of the supplied values. Signed-off-by: Aaron M. Ucko <[email protected]>
Store a copy of the hints; clean it up in _blk_clean_desc. Signed-off-by: Aaron M. Ucko <[email protected]>
Cast away harmless discrepancies, mostly by taking advantage of TDS_PUT_BYTE. Signed-off-by: Aaron M. Ucko <[email protected]>
468812f
to
8739305
Compare
My latest version of this series adds a commit to address a corner case I recently encountered by (re)allowing BCP into computed columns in the context of a |
Picking up a commit to replace one work I was doing, specifically ae519f8. |
This PR roughly corresponds to the primarily BCP-related commits from #555, adding 02aa63c per dependency considerations and omitting a couple of commits that can receive individual PRs, which I'll open shortly.