mirror of https://github.com/ultrajson/ultrajson.git synced 2024-06-02 14:46:04 +02:00
Commit Graph

119 Commits

Author SHA1 Message Date
Eugene Toder eda5ecd2c2 Speed-up and cleanup objToJSON
* Use PyDict_Next() to iterate over dicts.
* Use macros to access lists, tuples, bytes.
* Avoid calling PyErr_Occurred() if not necessary.
* Fix a memory leak when encoding very large ints.
* Delete dead and duplicate code.

Also,

* Raise TypeError if toDict() returns a non-dict instead of silently
  converting it to null.
2023-12-10 21:11:20 +00:00
Eugene Toder a08b75b970 Use lowercase strings for bool dict keys
Fixes #613

Also,

* Consolidate key conversion for sorted and unsorted cases.
* Fix memory leak of the "null" string when handling None dict key.
2023-11-27 17:26:17 +00:00
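For comparison, the standard-library `json` module already writes boolean dict keys as lowercase strings. The sketch below shows only that stdlib behaviour; it does not import ujson, and treating it as the target output of this commit is an assumption.

```python
import json

# Standard-library behaviour only; ujson is not imported here.
# Boolean keys are coerced to the JSON literals' spelling.
encoded = json.dumps({True: 1, False: 0})
print(encoded)
```

Note that the keys decode back as the strings `"true"` and `"false"`, not as booleans, so the conversion is one-way.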
Brénainn Woodsend b18f60d31f Fix encoding of infinity (#80).
Infinity was being encoded as 'Inf'. Whilst the JSON spec doesn't include
any non-finite floats, this differs from the convention in other JSON libraries
and JavaScript of using 'Infinity'. It also differs from what `ujson.loads()`
expects, so `ujson.loads(ujson.dumps(math.inf))` raised an exception.

Closes #80.
2022-08-08 22:37:52 +01:00
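The round trip described in this commit can be illustrated with the standard-library `json` module, which uses the 'Infinity' spelling. This is a sketch of the stdlib convention only; ujson is not imported, and the variable names are illustrative.

```python
import json
import math

# Standard-library behaviour only: non-finite floats use JavaScript's spelling.
encoded_pos = json.dumps(math.inf)
encoded_neg = json.dumps(-math.inf)

# The same spelling is accepted on decoding, so the round trip succeeds.
decoded = json.loads(encoded_pos)
print(encoded_pos, encoded_neg, decoded)
```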
Tim Gates 264f60c018 docs: Fix a few typos
There are small typos in:
- python/objToJSON.c
- tests/test_ujson.py

Fixes:
- Should read `standard` rather than `stanard`.
- Should read `gibberish` rather than `jibberish`.

Signed-off-by: Tim Gates <tim.gates@iress.com>
2022-07-20 09:59:57 +01:00
JustAnotherArchivist 8a946e5830 Add separators encoding parameter
Closes #283
2022-07-11 00:43:29 +01:00
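The commit message does not describe the parameter's shape. As a point of reference, the standard-library `json.dumps` takes `separators` as an `(item_separator, key_separator)` tuple; assuming ujson's parameter mirrors that (an assumption, not confirmed here), the effect looks like this stdlib-only sketch:

```python
import json

data = {"a": [1, 2]}

# Standard-library behaviour only; ujson is not imported here.
default_form = json.dumps(data)                         # ", " and ": "
compact_form = json.dumps(data, separators=(",", ":"))  # no extra whitespace
print(default_form)
print(compact_form)
```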
JustAnotherArchivist aa068e335f Add support for arbitrary size integers 2022-06-16 17:26:19 +00:00
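What "arbitrary size" means in practice can be sketched with the standard-library `json` module, which writes Python integers beyond 64 bits as plain decimal digits. The example is stdlib-only; whether ujson's output is identical for every input is not claimed here.

```python
import json

# An integer well beyond the 64-bit range.
big = 2**100

# Standard-library behaviour only: the value is written as decimal digits
# and parsed back into a Python int without loss.
encoded = json.dumps(big)
decoded = json.loads(encoded)
print(encoded)
```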
JustAnotherArchivist 666d159db8 Fix memory leak on encoding errors when the buffer was resized
`JSON_EncodeObject` returns `NULL` when an error occurs, but without freeing the buffer. This leads to a memory leak when the buffer is internally allocated (because the caller's buffer was insufficient or none was provided at all) and any error occurs. Similarly, `objToJSON` did not clean up the buffer in all error conditions either.

This adds the missing buffer free in `JSON_EncodeObject` (iff the buffer was allocated internally) and refactors the error handling in `objToJSON` slightly to also free the buffer when a Python exception occurred without the encoder's `errorMsg` being set.
2022-06-04 19:32:56 +00:00
JustAnotherArchivist 59aa3bf40e Fix bytesObj not getting assigned and DECREFd, resulting in a memory leak 2022-05-30 19:23:13 +00:00
JustAnotherArchivist 98321fad98 Switch to NULL encoding (= UTF-8) to avoid string comparison in PyUnicode_AsEncodedString 2022-05-30 01:58:58 +00:00
JustAnotherArchivist 9b9af1ab70 Fix handling of surrogates on encoding
This allows surrogates anywhere in the input, compatible with the json module from the standard library.

This also refactors two interfaces:
- The `PyUnicode` to `char*` conversion is moved into its own function, separated from the `JSONTypeContext` handling, so it can be reused for other things in the future (e.g. indentation and separators) which don't have a type context.
- Converting the `char*` output to a Python string with surrogates intact requires the string length for `PyUnicode_Decode` & Co. While `strlen` could be used, the length is already known inside the encoder, so the encoder function now also takes an extra `size_t` pointer argument to return that and no longer NUL-terminates the string. This also permits output that contains NUL bytes (even though that would be invalid JSON), e.g. if an object's `__json__` method return value were to contain them.

Fixes #156
Fixes #447
Fixes #537
Supersedes #284
2022-05-30 01:58:12 +00:00
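The standard-library behaviour this commit refers to can be sketched as follows. The example uses only the `json` module, not ujson, and a lone high surrogate chosen for illustration.

```python
import json

# A lone (unpaired) high surrogate, which is not valid in UTF-8.
lone = "\ud83d"

# With the default ensure_ascii=True, the surrogate is written as an escape.
escaped = json.dumps(lone)

# With ensure_ascii=False, it is passed through unchanged instead of raising.
raw = json.dumps(lone, ensure_ascii=False)

# Decoding the escaped form restores the original string.
restored = json.loads(escaped)
print(escaped)
```

The raw form is deliberately not printed, since writing a lone surrogate to a UTF-8 stream raises `UnicodeEncodeError`.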
JustAnotherArchivist b3f8754c8a Fix segmentation faults when handling unserialisable objects
Errors during `__repr__` itself as well as ones during the conversion to a bytes object were not handled, resulting in NULL pointer dereferencing.

Cf. #382
2022-04-18 12:20:18 +01:00
JustAnotherArchivist 935fe0cec4 Fix segmentation fault when an exception is raised while converting a dict key to a string
Fixes #522
2022-04-13 00:04:24 +00:00
JustAnotherArchivist 62dec8de71 Fix ref counting on non-string dict keys
For bytes, there was an extraneous INCREF; PyIter_Next returns a new reference. For other non-strings, the original itemName before converting to a string was never dereferenced.

Fixes #419
2022-04-07 20:31:36 +01:00
JustAnotherArchivist 2d1f088c2e Fix ref counting on repeated default function calls
Fixes #523
2022-04-07 20:20:01 +01:00
RouquinBlanc e6dc25cf12 simplify exception handling on integer overflow 2022-02-20 11:01:24 +00:00
JustAnotherArchivist f9aa23b5e6 Remove dead code that used to handle the separate int type in Python 2 2022-02-20 10:59:11 +00:00
JustAnotherArchivist 4bd21e2483 Fix exceptions on encoding list or dict elements and non-overflow errors on int handling getting silenced
Fixes #273
2022-02-16 08:17:47 +00:00
garenchan b7fba98136 Add a default keyword argument to dumps
The dump and dumps functions in Python's stdlib json module have a default keyword argument.
It's useful for serializing complex objects. Supporting this argument improves the compatibility and flexibility of ujson.
2021-09-06 09:55:37 +08:00
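A minimal sketch of the stdlib `default` hook that this commit mirrors. It uses the standard-library `json` module only, and `encode_extra` is a hypothetical helper name introduced for illustration.

```python
import json


def encode_extra(obj):
    """Hypothetical fallback: called only for objects json cannot encode."""
    if isinstance(obj, set):
        # Sets have no JSON equivalent; emit a sorted list instead.
        return sorted(obj)
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


# Standard-library behaviour only; ujson is not imported here.
encoded = json.dumps({"tags": {"b", "a"}}, default=encode_extra)
print(encoded)
```

Raising `TypeError` for anything the hook does not recognise keeps unsupported types from being silently dropped.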
Dr. Nick e00caaebd5 dconv no longer uses global instances of StringToDoubleConverter/DoubleToStringConverter 2021-08-03 10:17:10 -04:00
David W.H. Swenson af699c3cd0 Match Python json output for exponents 2020-11-11 14:41:51 +01:00
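The standard-library output format that this commit targets can be seen directly. The sketch uses the `json` module only and does not verify ujson's output.

```python
import json

# Standard-library behaviour only: floats are written using repr(), which
# includes an explicit sign and at least two exponent digits.
large = json.dumps(1e16)
small = json.dumps(1e-5)
print(large, small)
```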
Mark Bishop 7687b3de7a Fix dealing with None types 2020-10-30 16:48:27 -04:00
Chen-Han Hsiao (Stanley) 5b979eeebf Fix indent and add test case 2020-09-17 22:41:00 +08:00
Maxwell Bernstein 4e3a86791e Make ujson PEP-384 compliant
Add a function to check if an object is of type `decimal.Decimal`.
Since that type was previously cached as a static variable, this commit
makes it a member of the module state instead. Add the associated module
state machinery.

Only enable compact ASCII shortcut in non Limited API.

Also check if the module exists before creating it anew in the init
function.

Also remove unnecessary and leaky Py_INCREF. PyObject_GetAttrString
returns a new reference.

See PEP 384 (Defining a Stable ABI):
https://www.python.org/dev/peps/pep-0384/ and PEP 3121 (Extension Module
Initialization and Finalization):
https://www.python.org/dev/peps/pep-3121/
2020-08-12 13:08:08 -07:00
Hugo 5f1e8479fa Lint trailing-whitespace 2020-05-12 09:21:45 +03:00
Eric Le Lay 417b275b3c make reject_bytes=True the default 2020-05-08 17:57:46 +02:00
Eric Le Lay 9030207900 fix for python3 2020-05-08 17:38:13 +02:00
Eric Le Lay e0c113e6a2 Merge branch 'master' into 264-reject_bytes 2020-05-08 17:34:35 +02:00
Hugo van Kemenade b08ea47fb9 Merge branch 'master' into add_nan_support 2020-05-03 20:49:20 +03:00
Hugo ff8e64caa1 Drop support for EOL Python 2 2020-04-20 20:09:12 +03:00
Hugo van Kemenade d9ca1c9b5b Merge branch 'master' into add_nan_support 2020-03-27 21:41:33 +02:00
Sami Salonen e5ecb9d240 itemNameTmp needed also for py2. 2020-03-21 11:01:03 +02:00
Sami Salonen 92c57b4210 Decrease dict key reference.
Changeset c9f8318 changed how dict items are iterated. As a consequence,
a reference to a dict key remains - clear it.
2020-03-21 00:17:22 +02:00
Hugo b494f73f5e Fix Python 3 encoding of 'X is not JSON serializable' 2020-03-13 15:26:28 +02:00
Richard Frank 36089a578e Fix reference counting bug for dict values
which meant a memory leak.

PyObject_GetItem returns a new reference (and goes through
abstract object[key] API), whereas PyDict_GetItem returns a borrowed
reference and goes directly to the dict hash lookup.
2020-03-02 16:51:03 -05:00
Hugo van Kemenade cfb597de39 Merge pull request #257 from borman/master
Fix a couple of memory leaks.
2020-03-02 23:42:13 +02:00
Hugo van Kemenade 631850788d Merge branch 'master' into add_nan_support 2020-02-25 22:34:37 +02:00
Hugo van Kemenade 1588690257 Merge branch 'master' into 264-reject_bytes 2020-02-25 22:28:14 +02:00
Mark Guzman fe0e88d345 adding an allow_nan keyword argument to dumps, defaulted to True
With this, ujson matches the builtin json behavior for NaN and Inf.
If a user wants to retain the old behavior, they can pass allow_nan=False
to ensure strict JSON compatibility.
2019-02-20 09:50:05 -05:00
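The builtin `json` behaviour that this commit matches can be sketched as follows. The example uses only the standard library, and the variable names are illustrative.

```python
import json

nan = float("nan")

# allow_nan defaults to True in the stdlib, so NaN is written out as-is.
permissive = json.dumps(nan)

# With allow_nan=False, non-finite floats are rejected with ValueError.
try:
    json.dumps(nan, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc

print(permissive, type(strict_error).__name__)
```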
Hugo 054c0b7a34 Drop EOL Python 2.5, 2.6, 3.2 and 3.3 2017-12-26 13:46:44 +02:00
Eric Le Lay b9c7fffca9 help branch prediction
was seeing a ~5% drop in performance without it on 250 strings
2017-06-11 20:30:39 +02:00
Eric Le Lay ad280fd99e new reject_bytes option to raise on bytes
raise TypeError when encountering bytes in ujson.dumps() to prevent
unexpected Unicode exceptions in production.
Fixes #264
2017-06-11 11:58:10 +02:00
Mikhail Borisov 4481b8d53b Do not discard result of PyObject_CallObject()
PyObject_CallObject() returns a PyObject*; discarding it leaked
memory for the result of output.write().
2017-03-23 10:20:09 +03:00
Mikhail Borisov bc94d64fba Release saved raw JSON string.
Using ujson.dumps() with objects having __json__() method leaked memory
for object's JSON representation.
2017-03-23 10:19:06 +03:00
Joakim Hamren eb7d894f22 Integrated google's double-conversion lib
To fix issues with floating-point precision we've made use of Google's
double-conversion lib to handle conversions of doubles to and from strings.

In addition to fixing our precision problems this will improve double
encoding by 4-5x. Decoding is however slightly slower according to the
benchmarks - but accurate at least.

This change removes the double_precision encoding option and the
precise_float decoding option.
2017-02-14 12:20:04 +01:00
Joakim Hamren c9f8318bd8 Fix for incorrect order when using OrderedDict 2017-02-07 02:02:38 +01:00
Joakim Hamren 50181f060f Removed serialization of date/datetime objects
To better align with the standard json module this removes ujson
default serialization of date/datetime objects to unix-timestamps.

Trying to serialize such an object will now raise a TypeError "repr(obj)
is not JSON serializable".
2017-02-06 23:27:29 +01:00
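The standard-library behaviour this commit aligns with can be sketched as below. Only the `json` module is used; the exact exception message differs between libraries, so the example checks the exception type alone. Converting to an ISO string is one possible explicit alternative, chosen here for illustration.

```python
import datetime
import json

day = datetime.date(2017, 2, 6)

# Standard-library behaviour only: date objects are not serialized implicitly.
try:
    json.dumps(day)
    raised = False
except TypeError:
    raised = True

# The caller picks a representation explicitly instead.
explicit = json.dumps(day.isoformat())
print(raised, explicit)
```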
Joakim Hamren 5f98f01095 Removed support for __json__ method on str
This functionality caused a performance regression without a use-case
justifying the trade-off.
2017-02-06 23:27:29 +01:00
Joakim Hamren 53f85b1bd6 Removed generic serialization of objects/iterables
The behavior of ujson has always been to try to serialize all objects in
any way possible. This has been quite a deviation from other json
libraries, including Python's standard json module, and the source of a
lot of confusion and bugs. Removing this quirk moves ultrajson closer to
the expected behavior.

Instead of trying to coerce serialization ultrajson will now throw a
TypeError: "repr(obj) is not JSON serializable" exception.
2017-02-06 23:27:25 +01:00
Joakim Hamren ac4637fbc4 Following std json handling of None dict key
Previously a None dict item key would be output in JSON as "None".
To better align with the standard json module this was changed to output
"null". There's no proper representation of null object keys in JSON so
this is implementation specific but it seems more natural to follow
suit when it can be done without a significant performance hit.

Added and used branch prediction macros (LIKELY/UNLIKELY) as well.
2017-02-04 16:36:14 +01:00
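The standard `json` handling that this commit follows can be sketched with the stdlib alone (ujson is not imported here):

```python
import json

# Standard-library behaviour only: a None key becomes the string "null".
encoded = json.dumps({None: 1})

# JSON object keys are always strings, so the key does not decode to None.
decoded = json.loads(encoded)
print(encoded, decoded)
```

As the commit message notes, JSON has no proper representation for a null key, so the original `None` cannot be recovered on decoding.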
Joakim Hamren 409c6d4006 Fix for overflowing long causing invalid json
This was caused by checking for "__json__" using PyObject_HasAttrString
which clears the error set by a previous long overflow. Thus this was dependent
on the order of processing of dict items, which explains why it was
seemingly random as the dict items are likely ordered by a hash of
the key.

This fixes GH224 and GH240.
2017-02-04 04:21:05 +01:00