Release 2.15.0
TensorFlow
Breaking Changes
tf.types.experimental.GenericFunction has been renamed to tf.types.experimental.PolymorphicFunction.
Major Features and Improvements
oneDNN CPU performance optimizations for Windows x64 & x86:
- Windows x64 & x86 packages:
  - oneDNN optimizations are enabled by default on x86 CPUs.
- To explicitly enable or disable oneDNN optimizations, set the environment variable TF_ENABLE_ONEDNN_OPTS to 1 (enable) or 0 (disable) before running TensorFlow; to fall back to the default settings, unset the environment variable (a short sketch follows this list).
- oneDNN optimizations can yield slightly different numerical results than runs with oneDNN optimizations disabled, due to floating-point round-off errors arising from different computation approaches and orders.
- To verify whether oneDNN optimizations are on, look for a message containing "oneDNN custom operations are on" in the log. If that exact phrase is not there, they are off.
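A minimal sketch of toggling the flag from Python; setting the variable in the shell before launching the process works the same way. The variable must be set before TensorFlow is imported.

```python
import os

# Opt in to or out of oneDNN optimizations; unset the variable to fall back
# to the default behavior.
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"  # "1" = enable, "0" = disable

import tensorflow as tf  # imported only after the flag is set
```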
Making the tf.function type system fully available:
- tf.types.experimental.TraceType now allows custom tf.function inputs to declare Tensor decomposition and type casting support.
- Introducing tf.types.experimental.FunctionType as the comprehensive representation of the signature of tf.function callables. It can be accessed through the function_type property of tf.functions and ConcreteFunctions. See the tf.types.experimental.FunctionType documentation for more details.
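A minimal sketch of inspecting the new signature representation through the function_type property described above:

```python
import tensorflow as tf

@tf.function
def add(x, y):
  return x + y

# The FunctionType of the polymorphic tf.function (its overall signature).
print(add.function_type)

# ConcreteFunctions expose a FunctionType for their specific traced signature.
concrete = add.get_concrete_function(
    tf.TensorSpec([], tf.float32), tf.TensorSpec([], tf.float32))
print(concrete.function_type)
```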
Introducing tf.types.experimental.AtomicFunction as the fastest way to perform TF computations in Python.
- Can be accessed through the inference_fn property of ConcreteFunctions.
- Does not support gradients.
- See the tf.types.experimental.AtomicFunction documentation for how to call and use it.
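A sketch of reaching the AtomicFunction via the inference_fn property noted above. The exact calling convention is documented on tf.types.experimental.AtomicFunction; the direct call below is an assumption for a simple single-tensor signature.

```python
import tensorflow as tf

@tf.function
def square(x):
  return x * x

concrete = square.get_concrete_function(tf.TensorSpec([], tf.float32))

# inference_fn is an AtomicFunction: the lowest-overhead way to run the traced
# computation from Python. It does not support gradients.
atomic = concrete.inference_fn
print(atomic(tf.constant(3.0)))  # calling convention assumed; see the docs
```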
tf.data:
- Moved the warm_start option from tf.data.experimental.OptimizationOptions to tf.data.Options.
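A sketch of setting the relocated option; the attribute name experimental_warm_start on tf.data.Options is an assumption based on the usual naming of experimental options.

```python
import tensorflow as tf

ds = tf.data.Dataset.range(1000).map(lambda x: x * 2).batch(32)

options = tf.data.Options()
# warm_start now lives on tf.data.Options rather than
# tf.data.experimental.OptimizationOptions; the attribute name is assumed here.
options.experimental_warm_start = True
ds = ds.with_options(options)
```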
tf.lite:
- sub_op and mul_op support broadcasting up to 6 dimensions.
- The tflite::SignatureRunner class, which provides support for named parameters and for multiple named computations within a single TF Lite model, is no longer considered experimental (a Python-level usage sketch follows this list). Likewise for the following signature-related methods of tflite::Interpreter:
  - tflite::Interpreter::GetSignatureRunner
  - tflite::Interpreter::signature_keys
  - tflite::Interpreter::signature_inputs
  - tflite::Interpreter::signature_outputs
  - tflite::Interpreter::input_tensor_by_signature
  - tflite::Interpreter::output_tensor_by_signature
- Similarly, the following signature runner functions in the TF Lite C API are no longer considered experimental:
  - TfLiteInterpreterGetSignatureCount
  - TfLiteInterpreterGetSignatureKey
  - TfLiteInterpreterGetSignatureRunner
  - TfLiteSignatureRunnerAllocateTensors
  - TfLiteSignatureRunnerGetInputCount
  - TfLiteSignatureRunnerGetInputName
  - TfLiteSignatureRunnerGetInputTensor
  - TfLiteSignatureRunnerGetOutputCount
  - TfLiteSignatureRunnerGetOutputName
  - TfLiteSignatureRunnerGetOutputTensor
  - TfLiteSignatureRunnerInvoke
  - TfLiteSignatureRunnerResizeInputTensor
- New C API function TfLiteExtensionApisVersion added to tensorflow/lite/c/c_api.h.
- Added int8 and int16x8 support for the RSQRT operator.
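The stabilized classes and functions above are C++/C APIs. For orientation, here is a sketch of the equivalent Python-level signature-runner workflow; the model path and the signature key "serving_default" are placeholders.

```python
import tensorflow as tf

# Placeholder model path; any TFLite model with named signatures works.
interpreter = tf.lite.Interpreter(model_path="model.tflite")

# List the named computations (signatures) available in the model.
print(interpreter.get_signature_list())

# Run one signature by name, passing inputs as named parameters.
runner = interpreter.get_signature_runner("serving_default")
outputs = runner(x=tf.constant([[1.0, 2.0]]))  # input name "x" is a placeholder
```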
Android NDK r25 is supported.
Bug Fixes and Other Changes
Add TensorFlow Quantizer to TensorFlow pip package.
tf.sparse.segment_sum, tf.sparse.segment_mean, tf.sparse.segment_sqrt_n (SparseSegmentSum/Mean/SqrtN[WithNumSegments]):
- Added a sparse_gradient option (default: false) that makes the gradient of these functions/ops sparse (IndexedSlices) instead of dense (Tensor), using the new SparseSegmentSum/Mean/SqrtNGradV2 ops.
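A sketch of the new option: with sparse_gradient=True, the gradient with respect to the data argument comes back as IndexedSlices rather than a dense Tensor.

```python
import tensorflow as tf

data = tf.Variable([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
indices = tf.constant([0, 1])
segment_ids = tf.constant([0, 0])

with tf.GradientTape() as tape:
  # sparse_gradient=True routes the backward pass through the new
  # SparseSegmentSumGradV2 op and yields an IndexedSlices gradient.
  out = tf.sparse.segment_sum(data, indices, segment_ids, sparse_gradient=True)

grad = tape.gradient(out, data)
print(type(grad))  # expected: tf.IndexedSlices
```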
tf.nn.embedding_lookup_sparse:
- Optimized this function for some cases by fusing internal operations.
tf.saved_model.SaveOptions:
- Provided a new experimental_skip_saver argument which, if specified, suppresses the addition of SavedModel-native save and restore ops to the SavedModel. This is intended for cases where users already build custom save/restore ops and checkpoint formats for the model being saved, and the SavedModel-native save/restore ops would only lengthen model serialization times.
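A sketch of opting out of the SavedModel-native saver; the module and export path below are placeholders.

```python
import tensorflow as tf

module = tf.Module()
module.v = tf.Variable(1.0)

# Skip adding the SavedModel-native save/restore ops, for models that already
# provide custom save/restore ops and their own checkpoint format.
options = tf.saved_model.SaveOptions(experimental_skip_saver=True)
tf.saved_model.save(module, "/tmp/exported_module", options=options)
```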
Add ops to tensorflow.raw_ops that were missing.
tf.CheckpointOptions:
- It now takes in a new argument called experimental_write_callbacks. These are callbacks that will be executed after a saving event finishes writing the checkpoint file.
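A sketch of registering a write callback via tf.train.CheckpointOptions; the exact callback signature is not spelled out above, so the save-path parameter below is an assumption.

```python
import tensorflow as tf

ckpt = tf.train.Checkpoint(step=tf.Variable(0))

def on_write(save_path=None):
  # Runs after the checkpoint file has finished writing; receiving the save
  # path here is an assumption about the callback signature.
  print("checkpoint written:", save_path)

options = tf.train.CheckpointOptions(experimental_write_callbacks=[on_write])
ckpt.write("/tmp/ckpt/model", options=options)
```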
Add an option disable_eager_executer_streaming_enqueue to tensorflow.ConfigProto.Experimental to control the eager runtime's behavior around parallel remote function invocations; when set to True, the eager runtime will be allowed to execute multiple function invocations in parallel.
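A sketch of setting the new proto field; how the resulting proto is handed to the eager runtime (for example through a session or server configuration) is environment-specific and not shown.

```python
import tensorflow as tf

# The field lives on ConfigProto.Experimental; the spelling below matches the
# proto field name stated in this release note.
config = tf.compat.v1.ConfigProto()
config.experimental.disable_eager_executer_streaming_enqueue = True
```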
tf.constant_initializer:
- It now takes a new argument called support_partition. If True, constant_initializers can create sharded variables. This is disabled by default, matching the existing behavior.
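A minimal sketch of the new flag; only construction of the initializer is shown, since how sharded variables consume it is not spelled out above.

```python
import tensorflow as tf

# support_partition defaults to False, preserving the previous behavior.
# With True, the initializer can be used to create sharded variables.
init = tf.constant_initializer(0.5, support_partition=True)
```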
tf.lite:
- Added support for stablehlo.scatter.
tf.estimator:
- The tf.estimator API removal is in progress and is targeted for the 2.16 release.
Keras
Thanks to our Contributors
This release contains contributions from many people at Google, as well as:
Aiden Grossman, Akash Patel, Akhil Goel, Alexander Pivovarov, Andrew Goodbody, Ayan Moitra, Ben Barsdell, Ben Olson, Bhavani Subramanian, Boian Petkantchin, Bruce Lai, Chao Chen, Christian Steinmeyer, cjflan, David Korczynski, Donghak Park, Dragan Mladjenovic, Eli Kobrin, Fadi Arafeh, Feiyue Chen, Frédéric Bastien, guozhong.zhuang, halseycamilla, Harshavardhan Bellamkonda, James Ward, jameshollyer, Jane Liu, johnnkp, jswag180, justkw, Kanvi Khanna, Keith Smiley, Koan-Sin Tan, Kulin Seth, Kun-Lu, kushanam, Lu Teng, mdfaijul, Mehdi Drissi, mgokulkrish, mraunak, Mustafa Uzun, Namrata Bhave, Pavel Emeliyanenko, pemeliya, Peng Sun, Philipp Hack, Pratik Joshi, Rahul Batra, Raunak, redwrasse, Saoirse Stewart, SaoirseARM, seanshpark, Shanbin Ke, Spenser Bauman, Surya, sushreebarsa, Tai Ly, Thibaut Goetghebuer-Planchon, tilakrayal, Tirumalesh, Tj Xu, Vladislav, weihanmines, Wen Chen, wenchenvincent, wenscarl, William Muir, Zhoulong, Jiang