Data Processing

This is the data processing pipeline for 3D shape and texture generation.

Notes:

This implementation is a simplified version of our industrial pipeline.
The rendering script is based on TRELLIS.

Rendering

Motivation

The rendering script render/render.py serves three main purposes:

1. Converting complex 3D formats to PLY files using Blender for further processing.
2. Rendering condition images for DiT training.
3. Rendering orthogonal images, PBR materials, and conditional signals (world-space normals and positions) for texture generation.

Requirements

The rendering scripts run under Blender 4.1. Install OpenCV, OpenEXR, and Imath into Blender's bundled Python. For example, on a MacBook:

/Applications/Blender.app/Contents/Resources/4.1/python/bin/python3.11 -m pip install OpenEXR Imath opencv-python
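To confirm the install landed in Blender's interpreter rather than the system one, a small check like the following can be run with that same Python binary. This is a minimal sketch: the helper name `missing_modules` is hypothetical, and the import names (`cv2` for opencv-python, `OpenEXR`, `Imath`) are the usual ones for these packages.

```python
import importlib.util

# Import names for the packages installed above (opencv-python is imported as cv2).
REQUIRED_MODULES = ("cv2", "OpenEXR", "Imath")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing modules:", ", ".join(missing) if missing else "none")
```

An empty result means all three modules are visible to the interpreter that ran the check.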

Execution

The first two purposes (PLY conversion and condition-image rendering) are covered by a single command:

$BLENDER_PATH -b -P render/render.py -- \
    --object ${INPUT_FILE} --geo_mode --resolution 512 \
    --output_folder $OUTPUT_FOLDER

For the third purpose (texture-generation data), run the same command without the --geo_mode flag.
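When rendering many objects from a Python driver, the two variants of the command can be assembled programmatically. The sketch below only builds the argument lists shown above; `build_render_command` is a hypothetical helper, and the file names are placeholders.

```python
import shlex


def build_render_command(blender_path, input_file, output_folder,
                         geo_mode=True, resolution=512):
    """Build the Blender argument list for render/render.py.

    geo_mode=True mirrors the documented command (PLY conversion and
    condition images); geo_mode=False drops the flag for the texture pass.
    """
    cmd = [blender_path, "-b", "-P", "render/render.py", "--",
           "--object", input_file]
    if geo_mode:
        cmd.append("--geo_mode")
    cmd += ["--resolution", str(resolution), "--output_folder", output_folder]
    return cmd


# Placeholder paths, for illustration only.
geo_cmd = build_render_command("blender", "model.glb", "out")
tex_cmd = build_render_command("blender", "model.glb", "out", geo_mode=False)
print(shlex.join(geo_cmd))
print(shlex.join(tex_cmd))
```

Either list can be passed to `subprocess.run(cmd, check=True)` once the paths point at a real Blender executable and input file.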

Watertight Mesh Processing and Sampling

Motivation

To learn an SDF representation for 3DShape2VecSet, we require a watertight input mesh. This pipeline processes raw triangle meshes to generate three essential data types:

- Surface samples: input points for the encoder.
- Volume samples: query points for SDF evaluation in the decoder.
- Volume SDFs: ground-truth signed distance values for VAE training.

Execution

Process a triangle mesh (OBJ/OFF format) to generate:

- Watertight mesh (${OUTPUT_NAME}_watertight.obj).
- Surface point samples (${OUTPUT_NAME}_surface.npz).
- Volume samples with SDFs (${OUTPUT_NAME}_sdf.npz).

Command:

python3 watertight/watertight_and_sample.py \
    --input_obj ${INPUT_MESH} \
    --output_prefix ${OUTPUT_NAME}
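After a batch run it can be useful to check that all three files were written for a given prefix. The sketch below derives the expected paths from the naming scheme listed above; the helper names `expected_outputs` and `missing_outputs` are hypothetical.

```python
from pathlib import Path

# Suffixes appended to the --output_prefix value, as listed above.
OUTPUT_SUFFIXES = {
    "watertight": "_watertight.obj",
    "surface": "_surface.npz",
    "sdf": "_sdf.npz",
}


def expected_outputs(output_prefix):
    """Map each output kind to the path the script is expected to write."""
    return {kind: Path(f"{output_prefix}{suffix}")
            for kind, suffix in OUTPUT_SUFFIXES.items()}


def missing_outputs(output_prefix):
    """List the expected output files that do not exist on disk."""
    return [str(path) for path in expected_outputs(output_prefix).values()
            if not path.exists()]


print(expected_outputs("out/chair"))
```

A non-empty result from `missing_outputs` indicates that processing failed or was interrupted for that mesh.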

Output Data Format

1. Surface Samples (`${OUTPUT_NAME}_surface.npz`)

Contains two point cloud arrays in NumPy NPZ format:

| Key | Shape | Dtype | Description |
| --- | --- | --- | --- |
| `random_surface` | `(N, 6)` | `float16` | Uniform point samples on surface |
| `sharp_surface` | `(M, 6)` | `float16` | Samples near sharp mesh edges |
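Each row of these arrays holds six values. The sketch below splits a row into position and normal, assuming the first three columns are XYZ and the last three are the normal (the order suggested by the "XYZ/Normal" description; verify against your own data). It also casts to float32, a common choice before further computation. `split_points_and_normals` is a hypothetical helper, and the demo array is synthetic.

```python
import numpy as np


def split_points_and_normals(samples):
    """Split an (N, 6) array into (N, 3) positions and (N, 3) normals.

    Assumes columns 0-2 are XYZ and columns 3-5 are the normal.
    """
    samples = np.asarray(samples)
    if samples.ndim != 2 or samples.shape[1] != 6:
        raise ValueError(f"expected shape (N, 6), got {samples.shape}")
    xyz = samples[:, :3].astype(np.float32)
    normals = samples[:, 3:].astype(np.float32)
    return xyz, normals


# Synthetic stand-in for data["random_surface"] loaded with np.load.
demo = np.array([[0, 0, 1, 0, 0, 1],
                 [1, 0, 0, 1, 0, 0]], dtype=np.float16)
xyz, normals = split_points_and_normals(demo)
print(xyz.shape, normals.shape)
```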

2. Volume SDF Samples (`${OUTPUT_NAME}_sdf.npz`)

Contains three sample types, each stored as a pair of arrays. For each type ${type}, positions are in `${type}_points` and SDF values are in `${type}_label`:

| Sample Type | Points Array | SDF Labels Array | Shape | Dtype | Description |
| --- | --- | --- | --- | --- | --- |
| `vol` | `vol_points` | `vol_label` | `(P, 3)/(P,)` | `float16` | Random spatial samples |
| `random_near` | `random_near_points` | `random_near_label` | `(Q, 3)/(Q,)` | `float16` | Samples near surface |
| `sharp_near` | `sharp_near_points` | `sharp_near_label` | `(R, 3)/(R,)` | `float16` | Samples near sharp edges |

Data Specifications:

- All point coordinates (*_points arrays) are 3D positions stored as float16 values.
- All SDF values (*_label arrays) are float16 scalars with the following sign convention:
  - Positive values: outside the surface.
  - Negative values: inside the surface.
  - Zero values: on the surface.
- Array dimensions:
  - N, M, P, Q, R are sample counts (they vary per shape).
  - 3 indicates XYZ coordinates.
  - 6 indicates XYZ coordinates plus the surface normal.
- All arrays are stored uncompressed in NumPy's NPZ format.
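The layout described above can be checked with a short loader. This is a minimal sketch, not part of the pipeline: `check_sdf_file` and `inside_mask` are hypothetical helpers, and the demo writes a small synthetic file in place of a real `${OUTPUT_NAME}_sdf.npz`.

```python
import os
import tempfile

import numpy as np

SAMPLE_TYPES = ("vol", "random_near", "sharp_near")


def check_sdf_file(path):
    """Validate keys, shapes, and dtypes; return the sample count per type."""
    counts = {}
    with np.load(path) as data:
        for name in SAMPLE_TYPES:
            points = data[f"{name}_points"]
            labels = data[f"{name}_label"]
            if points.dtype != np.float16 or labels.dtype != np.float16:
                raise ValueError(f"{name}: expected float16 arrays")
            if points.ndim != 2 or points.shape[1] != 3:
                raise ValueError(f"{name}: points must have shape (K, 3)")
            if labels.shape != (points.shape[0],):
                raise ValueError(f"{name}: labels must have shape (K,)")
            counts[name] = points.shape[0]
    return counts


def inside_mask(labels):
    """True where the SDF is negative, i.e. the sample lies inside the surface."""
    return np.asarray(labels) < 0


# Synthetic stand-in for a real output file.
rng = np.random.default_rng(0)
fake = {}
for name, n in zip(SAMPLE_TYPES, (5, 4, 3)):
    fake[f"{name}_points"] = rng.uniform(-1, 1, (n, 3)).astype(np.float16)
    fake[f"{name}_label"] = rng.uniform(-0.1, 0.1, n).astype(np.float16)

demo_path = os.path.join(tempfile.mkdtemp(), "demo_sdf.npz")
np.savez(demo_path, **fake)  # uncompressed, matching the note above
counts = check_sdf_file(demo_path)
print(counts)
```

For real data, pass the path of a generated `_sdf.npz` file to `check_sdf_file` instead of the synthetic one.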

Overall Script

Modify the first four variables in pipeline.sh:

- INPUT_FILE: the path to each 3D data file.
- OUTPUT_FOLDER: the root path of the output dataset.
- NAME: the name used for the output path of each data item.
- BLENDER_PATH: the path to the Blender executable.

Then run the following script:

bash pipeline.sh
