TensorFlow Troubleshooting
Overview
This guide covers common TensorFlow 2.x installation and runtime issues, including:
- Installation with GPU support
- TensorRT integration problems
- Keras compatibility issues
- TensorBoard profiler bugs
TensorFlow 2.x Installation
Best Practices
Step 1: Upgrade pip
pip install --upgrade pip
Step 2: Install TensorFlow
For GPU support, install TensorFlow with the CUDA extra, which automatically installs compatible CUDA libraries:
python3 -m pip install 'tensorflow[and-cuda]'
Alternatively, install the base package, which may require manual CUDA setup depending on your system configuration:
pip install tensorflow
Step 3: Verify Installation
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
Expected output:
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
If you see an empty list [], check:
- NVIDIA drivers are installed - see Driver Installation
- CUDA version compatibility
- Environment activation - see Environment Setup
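The checks above can be gathered in one place with a short script. This is a minimal sketch, not an official diagnostic: `gpu_report` is a hypothetical helper name, and it assumes that `nvidia-smi` on the PATH is a reasonable sign that the NVIDIA driver is installed. It only imports TensorFlow if the package is present in the active environment.

```python
import importlib.util
import shutil


def gpu_report():
    """Collect basic facts that help explain an empty GPU list."""
    report = {
        # nvidia-smi is normally installed with the NVIDIA driver (assumption:
        # its absence from PATH points to a missing or broken driver install).
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        # False here usually means the wrong environment is active.
        "tensorflow_installed": importlib.util.find_spec("tensorflow") is not None,
    }
    if report["tensorflow_installed"]:
        import tensorflow as tf

        report["tensorflow_version"] = tf.__version__
        report["built_with_cuda"] = tf.test.is_built_with_cuda()
        # CUDA version the installed build targets; None for CPU-only builds.
        report["cuda_version"] = tf.sysconfig.get_build_info().get("cuda_version")
        report["gpus"] = [d.name for d in tf.config.list_physical_devices("GPU")]
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

Each key maps to one item in the checklist: `nvidia_smi_found` to the driver, `cuda_version` to CUDA compatibility, and `tensorflow_installed` to environment activation.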
TensorRT Integration Issues
Problem
TensorFlow cannot find TensorRT even after it is installed, and reports CUDA errors or warnings.
Solution
Step 1: Install TensorRT
pip install nvidia-pyindex
pip install nvidia-tensorrt
Step 2: Fix Library Path
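The `tensorrt_libs` directory used in this step can be located from inside the active environment instead of typing the path by hand. The sketch below assumes the TensorRT wheel places a `tensorrt_libs` directory directly under site-packages, as in the path shown in this step; `find_tensorrt_libs` is a hypothetical helper name, not part of TensorRT or TensorFlow.

```python
import site
import sysconfig
from pathlib import Path


def find_tensorrt_libs(search_dirs=None):
    """Return the first 'tensorrt_libs' directory found, or None."""
    if search_dirs is None:
        # Site-packages locations of the currently active interpreter.
        search_dirs = [sysconfig.get_paths()["purelib"], *site.getsitepackages()]
    for base in search_dirs:
        candidate = Path(base) / "tensorrt_libs"
        if candidate.is_dir():
            return candidate
    return None


if __name__ == "__main__":
    libs = find_tensorrt_libs()
    if libs is None:
        print("tensorrt_libs not found in this environment")
    else:
        # Prints a line that can be pasted into the shell or ~/.bashrc.
        print(f'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:"{libs}"')
```

Run it with the target environment activated; the printed path can replace the hard-coded one in the commands that follow.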
# Replace 'user' with your username and adjust Python version as needed
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:"/home/user/miniconda3/envs/tf/lib/python3.11/site-packages/tensorrt_libs/"
# Make it persistent by adding to ~/.bashrc or conda environment activation script
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:"/home/user/miniconda3/envs/tf/lib/python3.11/site-packages/tensorrt_libs/"' >> ~/.bashrc
Keras Compatibility Issues
Error: AttributeError: module 'keras' has no attribute 'ops'
Cause: Version mismatch between Keras and TensorFlow.
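One way to confirm the mismatch is to read the installed package versions without importing either library. This is a minimal sketch under one assumption: the failing code expects the Keras 3 API, which is where `keras.ops` was introduced. The helper names `installed_version` and `has_keras_ops` are hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_keras_ops(keras_version):
    """keras.ops arrived with Keras 3, so the major version decides."""
    return int(keras_version.split(".")[0]) >= 3


if __name__ == "__main__":
    tf_version = installed_version("tensorflow")
    keras_version = installed_version("keras")
    print(f"tensorflow: {tf_version}")
    print(f"keras: {keras_version}")
    if keras_version is not None and not has_keras_ops(keras_version):
        print("Installed Keras predates keras.ops; code written for Keras 3 will fail.")
```

Reading the metadata avoids triggering the AttributeError itself, so the check still works in an environment where the import is broken.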
Solutions:
Import Keras through TensorFlow:
# Instead of: import keras
from tensorflow import keras
# This ensures version compatibility
Or pin a Keras release that matches the installed TensorFlow:
pip install keras==2.15.0  # Adjust based on TensorFlow version
Then confirm which versions are in use:
import tensorflow as tf
print(f"TensorFlow: {tf.__version__}")
print(f"Keras: {tf.keras.__version__}")
TensorBoard Profiler Issues
Problem: Profile Data Not Showing
Symptoms: TensorBoard profiler shows “No profile data was found” even though profiling ran successfully.
Root Cause: Log file structure bug in TensorBoard profiler.
Solution:
# Move profile logs up one directory level
# From: logs/train/plugins/profile/...
# To: logs/plugins/profile/...
cd logs
mkdir -p plugins/profile  # Create the destination if it does not exist yet
mv train/plugins/profile/* plugins/profile/ 2>/dev/null || true
mv validation/plugins/profile/* plugins/profile/ 2>/dev/null || true
The profile logs should be at the same directory level as the train and validation directories, not inside them.
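The same relocation can be done from Python, for example at the end of a training script. This is a sketch under the layout described above (`logs/train/plugins/profile/...` and `logs/validation/plugins/profile/...`); `lift_profile_runs` is a hypothetical helper name. Unlike the `mv` commands, it skips any run whose name already exists at the destination rather than merging into it.

```python
import shutil
from pathlib import Path


def lift_profile_runs(log_dir, subdirs=("train", "validation")):
    """Move <log_dir>/<subdir>/plugins/profile/* to <log_dir>/plugins/profile/."""
    log_dir = Path(log_dir)
    target = log_dir / "plugins" / "profile"
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for sub in subdirs:
        source = log_dir / sub / "plugins" / "profile"
        if not source.is_dir():
            continue  # Nothing was profiled under this subdirectory.
        for run in sorted(source.iterdir()):
            destination = target / run.name
            if destination.exists():
                continue  # Keep existing data rather than overwrite it.
            shutil.move(str(run), str(destination))
            moved.append(destination)
    return moved


if __name__ == "__main__":
    for path in lift_profile_runs("logs"):
        print(f"moved to {path}")
```

After it runs, point TensorBoard at the top-level log directory as before.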
GitHub Discussion
Detailed Solution Guide
Related Resources
- Environment Setup - Python environment configuration
- GPU Detection - Troubleshoot GPU availability
- Driver Installation - CUDA and driver setup