augmentedManifestFile + PipeModeDataset example #63

@vlordier

Description

It would really help to have a full end-to-end example of, say, image classification with AugmentedManifestFile + PipeModeDataset,

as I keep getting parsing errors like:
tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Could not parse example input, value: '����

I build a JSON Lines augmented manifest with

{"image-ref": "s3://path/to/image", "label": 3}
{"image-ref": "s3://path/to/image", "label": 1}
{"image-ref": "s3://path/to/image", "label": 2}
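For reference, I write the manifest with something like this (a minimal sketch; the bucket paths and labels here are placeholders):

```python
import json

def write_augmented_manifest(entries, path):
    """Write one JSON object per line, as AugmentedManifestFile expects.

    Each line must be standalone, double-quoted JSON,
    e.g. {"image-ref": "s3://...", "label": 3}.
    """
    with open(path, "w") as f:
        for uri, label in entries:
            f.write(json.dumps({"image-ref": uri, "label": label}) + "\n")

entries = [("s3://bucket/img1.jpg", 3), ("s3://bucket/img2.jpg", 1)]
write_augmented_manifest(entries, "train.manifest")
```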

then prepare the training channel as

train_data = sagemaker.session.s3_input(augmented_manifest_file_on_s3,
                                        distribution='FullyReplicated',
                                        content_type='image/jpeg',
                                        s3_data_type='AugmentedManifestFile',
                                        attribute_names=['image-ref', 'label'],
                                        input_mode='Pipe',
                                        record_wrapping='RecordIO')
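Since record_wrapping='RecordIO', each attribute's payload should arrive framed in MXNet-style RecordIO records. For debugging the raw pipe locally, I use a rough pure-Python reader like the one below (a sketch; the magic number and 29-bit length encoding are my reading of the MXNet RecordIO format):

```python
import struct

_KMAGIC = 0xCED7230A  # MXNet/SageMaker RecordIO magic number

def read_recordio(buf):
    """Yield the payload of each RecordIO record in `buf` (a bytes object)."""
    pos = 0
    while pos < len(buf):
        magic, header = struct.unpack_from("<II", buf, pos)
        assert magic == _KMAGIC, "not a RecordIO record"
        length = header & ((1 << 29) - 1)  # lower 29 bits hold the payload length
        pos += 8                           # skip the two uint32 header words
        yield buf[pos:pos + length]
        pos += length + (-length % 4)      # payloads are padded to 4-byte boundaries
```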

and launch training with .fit as

data_channels = {'train': train_data}

# Train a model.
tf_estimator.fit(inputs=data_channels, logs=True)

in my entry script, I have

    dataset = PipeModeDataset(channel=channel)
    dataset = dataset.prefetch(tf.data.experimental.AUTOTUNE)
    dataset = dataset.batch(2)  # pair up the two records (one per attribute) for each sample
    dataset = dataset.map(combine)
    dataset = dataset.map(example_parser, num_parallel_calls=batch_size)
    dataset = dataset.repeat(epochs)
    dataset = dataset.batch(batch_size, drop_remainder=True)
    image_batch, label_batch = next(iter(dataset))
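My combine is essentially just splitting the pair back out (a sketch, assuming the two attributes arrive as alternating records in attribute_names order, which is why I batch(2) first):

```python
def combine(records):
    # After dataset.batch(2), `records` holds two consecutive pipe records:
    # records[0] is the 'image-ref' payload, records[1] is the 'label' payload.
    return records[0], records[1]
```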

and as a modified example parser, I have

def example_parser(example1, example2):

    feat1 = tf.io.parse_single_example(
        example1,
        features={
            'image-ref': tf.io.FixedLenFeature([], tf.string),
        })

    feat2 = tf.io.parse_single_example(
        example2,
        features={
            'label': tf.io.FixedLenFeature([], tf.int64),
        })

    image = feat1['image-ref']
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    label = tf.cast(feat2['label'], tf.int32)
    return image, label
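One thing I'm not sure of is whether the pipe even carries serialized tf.Example protos here. If each record is just the raw JPEG bytes (and the label record just the label's text from the manifest), a parser without parse_single_example would look roughly like this (an untested sketch, not a confirmed fix):

```python
import tensorflow as tf

def raw_parser(image_bytes, label_bytes):
    # Treat the first record as raw JPEG bytes rather than a tf.Example.
    image = tf.image.decode_jpeg(image_bytes, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    # Treat the second record as the label's textual value, e.g. b"3".
    label = tf.strings.to_number(label_bytes, out_type=tf.int32)
    return image, label
```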

What am I doing wrong?
The documentation here is not clear about using augmented manifest files.
