mirror of https://github.com/ultrajson/ultrajson.git synced 2024-11-22 11:22:04 +01:00

Ultra fast JSON decoder and encoder written in C with Python bindings

README.md

UltraJSON

UltraJSON is an ultra fast JSON encoder and decoder written in pure C with bindings for Python 3.9+.

Install with pip:

python -m pip install ujson

Project status

Warning

UltraJSON's architecture is fundamentally ill-suited to making changes without risk of introducing new security vulnerabilities. As a result, this library has been put into maintenance-only mode. Support for new Python versions will be added, and critical bugs and security issues will still be fixed, but all other changes will be rejected. Users are encouraged to migrate to orjson, which is both much faster and less likely to introduce a surprise buffer overflow vulnerability in the future.

Usage

UltraJSON may be used as a drop-in replacement for most other JSON parsers for Python:

>>> import ujson
>>> ujson.dumps([{"key": "value"}, 81, True])
'[{"key":"value"},81,true]'
>>> ujson.loads("""[{"key": "value"}, 81, true]""")
[{'key': 'value'}, 81, True]
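One common way to take advantage of the drop-in behaviour is an import fallback. The sketch below is an illustration rather than a pattern prescribed by the project: it assumes the calling code only uses `dumps` and `loads` with arguments that both modules accept, and it falls back to the standard library `json` module when `ujson` is not installed.

```python
# Prefer ujson when it is installed; otherwise use the standard library.
try:
    import ujson as json
except ImportError:  # ujson is not available in this environment
    import json

payload = [{"key": "value"}, 81, True]

encoded = json.dumps(payload)  # str containing the JSON text
decoded = json.loads(encoded)  # parsed back into Python objects

# The round trip preserves the data regardless of which module was imported.
print(decoded == payload)
```

Keyword arguments that exist only in ujson, such as `escape_forward_slashes`, would raise a `TypeError` under the fallback, so code written this way should stick to the options the two modules share.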

Encoder options

encode_html_chars

Enables special encoding of "unsafe" HTML characters as safer Unicode escape sequences. Default is False:

>>> ujson.dumps("<script>John&Doe", encode_html_chars=True)
'"\\u003cscript\\u003eJohn\\u0026Doe"'

ensure_ascii

Limits output to ASCII and escapes all extended characters above 127. Default is True. If your end format supports UTF-8, setting this option to False is highly recommended to save space:

>>> ujson.dumps("åäö")
'"\\u00e5\\u00e4\\u00f6"'
>>> ujson.dumps("åäö", ensure_ascii=False)
'"åäö"'
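To make the space saving concrete, the following sketch compares the UTF-8 encoded size of both outputs for the string used above. The import fallback to the standard library `json` module is an assumption added so the snippet runs even without ujson installed; that module accepts the same `ensure_ascii` keyword with the same default.

```python
try:
    import ujson as json_lib
except ImportError:
    import json as json_lib  # same ensure_ascii keyword and default

text = "åäö"

escaped = json_lib.dumps(text)                  # each character becomes \uXXXX
raw = json_lib.dumps(text, ensure_ascii=False)  # characters are kept as-is

# Measure what would actually be written to a UTF-8 file or socket.
escaped_size = len(escaped.encode("utf-8"))
raw_size = len(raw.encode("utf-8"))

print(escaped_size, raw_size)  # 20 8
```

Each escaped character costs six bytes, while the same character written directly costs two bytes in UTF-8. Both forms decode to the same string.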

escape_forward_slashes

Controls whether forward slashes (/) are escaped. Default is True:

>>> ujson.dumps("https://example.com")
'"https:\\/\\/example.com"'
>>> ujson.dumps("https://example.com", escape_forward_slashes=False)
'"https://example.com"'

indent

Controls whether indentation ("pretty output") is enabled. Default is 0 (disabled):

>>> ujson.dumps({"foo": "bar"})
'{"foo":"bar"}'
>>> print(ujson.dumps({"foo": "bar"}, indent=4))
{
    "foo":"bar"
}

Benchmarks

UltraJSON throughput, in calls per second, compared with other popular JSON libraries.

Test machine

Linux 5.15.0-1037-azure x86_64 #44-Ubuntu SMP Thu Apr 20 13:19:31 UTC 2023

Versions

CPython 3.11.3 (main, Apr 6 2023, 07:55:46) [GCC 11.3.0]
ujson : 5.7.1.dev26
orjson : 3.9.0
simplejson : 3.19.1
json : 2.0.9

	ujson	orjson	simplejson	json
Array with 256 doubles
encode	18,282	79,569	5,681	5,935
decode	28,765	93,283	13,844	13,367
Array with 256 UTF-8 strings
encode	3,457	26,437	3,630	3,653
decode	3,576	4,236	522	1,978
Array with 256 strings
encode	44,769	125,920	21,401	23,565
decode	28,518	75,043	41,496	42,221
Medium complex object
encode	11,672	47,659	3,913	5,729
decode	12,522	23,599	8,007	9,720
Array with 256 True values
encode	110,444	425,919	81,428	84,347
decode	203,430	318,193	146,867	156,249
Array with 256 dict{string, int} pairs
encode	14,170	72,514	3,050	7,079
decode	19,116	27,542	9,374	13,713
Dict with 256 arrays with 256 dict{string, int} pairs
encode	55	282	11	26
decode	48	53	27	34
Dict with 256 arrays with 256 dict{string, int} pairs, outputting sorted keys
encode	42		8	27
Complex object
encode	462		397	444
decode	480	618	177	310

The metrics above are in calls/sec; larger is better.
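For readers who want comparable numbers on their own machine, a calls/sec figure can be approximated with the standard library `timeit` module. This is a minimal sketch, not the project's benchmark script: the helper `calls_per_second` and the test data are hypothetical, and only the "Array with 256 doubles" case is imitated. ujson is measured only if it is importable.

```python
import json
import timeit


def calls_per_second(func):
    """Estimate how many times per second func() can be called."""
    # autorange() picks a loop count so the total run takes at least 0.2 s
    # and returns (number_of_calls, elapsed_seconds).
    number, elapsed = timeit.Timer(func).autorange()
    return number / elapsed


# Hypothetical test data resembling "Array with 256 doubles".
doubles = [i * 1.5 for i in range(256)]
encoded = json.dumps(doubles)

candidates = {"json": json}
try:
    import ujson
    candidates["ujson"] = ujson
except ImportError:
    pass  # measure only the standard library

results = {}
for name, module in candidates.items():
    results[name] = {
        "encode": calls_per_second(lambda: module.dumps(doubles)),
        "decode": calls_per_second(lambda: module.loads(encoded)),
    }

for name, row in results.items():
    print(f"{name:>6}  encode {row['encode']:>10,.0f}  decode {row['decode']:>10,.0f}")
```

Absolute numbers depend heavily on hardware, Python version, and library versions, so only the ratios between libraries measured in the same run are meaningful.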

Build options

For those with particular needs, such as Linux distribution packagers, several build options are provided in the form of environment variables.

Debugging symbols

UJSON_BUILD_NO_STRIP

By default, debugging symbols are stripped on Linux platforms. Setting this environment variable to 1 or True disables this behavior.

Using an external or system copy of the double-conversion library

The two environment variables described below, UJSON_BUILD_DC_INCLUDES and UJSON_BUILD_DC_LIBS, are typically used together, for example:

export UJSON_BUILD_DC_INCLUDES='/usr/include/double-conversion'
export UJSON_BUILD_DC_LIBS='-ldouble-conversion'

Users planning to link against an external shared library should be aware of the ABI-compatibility requirements this introduces when upgrading system libraries or copying compiled wheels to other machines.

UJSON_BUILD_DC_INCLUDES

One or more directories, delimited by os.pathsep (same as the PATH environment variable), in which to look for double-conversion header files; the default is to use the bundled copy.
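When a build is driven from Python, the delimiter rule above can be followed by joining the directories with `os.pathsep`. The sketch below is illustrative only: the helper `build_dc_env` and the second include directory are hypothetical, and the default linker flag is simply the one shown in the example above.

```python
import os


def build_dc_env(include_dirs, link_flags="-ldouble-conversion"):
    """Return a copy of the current environment with the build options set."""
    env = dict(os.environ)
    # Directories are joined with os.pathsep (":" on POSIX, ";" on Windows),
    # the same delimiter used by the PATH environment variable.
    env["UJSON_BUILD_DC_INCLUDES"] = os.pathsep.join(include_dirs)
    env["UJSON_BUILD_DC_LIBS"] = link_flags
    return env


include_dirs = [
    "/usr/include/double-conversion",
    "/opt/double-conversion/include",  # hypothetical extra location
]
env = build_dc_env(include_dirs)

print(env["UJSON_BUILD_DC_INCLUDES"])
print(env["UJSON_BUILD_DC_LIBS"])
```

The resulting mapping could then be passed as the `env` argument of `subprocess.run` when invoking a source build, which leaves the environment of the calling process unchanged.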

UJSON_BUILD_DC_LIBS

Compiler flags needed to link the double-conversion library; the default is to use the bundled copy.