György Kalmár, Istvan Megyeri
Published © GPL3+

Spark Sound Detection From Raw Audio Data

We solve a spark sound recognition task from raw audio data on a SparkFun RedBoard Artemis ATP board. Spark detection on a SparkFun!

Advanced • Full instructions provided • 20 hours • 310

Things used in this project

Hardware components

SparkFun RedBoard Artemis ATP
We recorded and classified spark sounds with the Artemis board. A neural network deployed on the board processes the raw audio signal in real time.
×1
Arduino Due
We used the Arduino Due to control electric spark generation.
×1
SparkFun Micro OLED Breakout (Qwiic)
We employed the OLED screen to display classification results.
×1

Software apps and online services

Microsoft Windows 10
We built the project on Windows 10.
TensorFlow
The neural network was developed in TensorFlow+Keras. Later, TensorFlow Lite Micro was used to deploy the trained model on the RedBoard Artemis ATP. Also, advanced methods were implemented in Keras to enhance model accuracy and robustness.
Ambiq SDK
The data acquisition was implemented by using the Ambiq SDK.

Story

Schematics

SparkFun RedBoard Artemis + OLED screen circuit diagram

SparkFun RedBoard Artemis + OLED screen schematics

Spark generator circuit diagram

Spark generator schematics

Code

Audio recording on the Artemis ATP

C/C++
This code implements an audio recorder: after the device receives the character 'r' on the serial port, it collects 12000 samples at a sampling rate of 11718 Hz (about one second of audio) and sends them back over the same port. It is based on the pdm_fft example in AmbiqSDK/boards_sfe/common/examples/pdm_fft; to build it, replace that example's main.c with this file, renamed to "main.c".
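As a quick sanity check on the 11718 Hz figure, the effective sample rate follows from the PDM settings used in the listing (1.5 MHz PDM clock, MCLK divider 1, decimation rate 64) and the formula in pdm_config_print(). The Python sketch below only reproduces that arithmetic on the PC; the function name is ours and is not part of the Ambiq SDK.

```python
def effective_sample_rate(pdm_clk_hz: int, mclk_div: int, decimation: int) -> float:
    """Same formula as pdm_config_print(): clk / (div * 2 * decimation)."""
    return pdm_clk_hz / (mclk_div * 2 * decimation)


# Values taken from g_sPdmConfig in the listing.
rate_hz = effective_sample_rate(1_500_000, 1, 64)
# One recording holds PDM_DATA_SIZE = 12000 samples.
duration_s = 12000 / rate_hz
print(f"{rate_hz:.2f} Hz, {duration_s:.3f} s per recording")
```

The firmware uses integer division, so it reports 11718 Hz, and one recording lasts slightly more than one second.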
//*****************************************************************************
//
//! @file pdm_fft.c
//!
//! @brief An example to show basic PDM operation.
//!
//! Purpose: This example enables the PDM interface to record audio signals from an
//! external microphone. The required pin connections are:
//!
//! GPIO 10 - PDM DATA
//! GPIO 11 - PDM CLK
//!
//! In this modified version the recorded samples are sent over the UART
//! rather than printed over the ITM.
//
//*****************************************************************************

//*****************************************************************************
//
// Copyright (c) 2019, Ambiq Micro
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are met:
//
// 1. Redistributions of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// 2. Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
//
// 3. Neither the name of the copyright holder nor the names of its
// contributors may be used to endorse or promote products derived from this
// software without specific prior written permission.
//
// Third party software included in this distribution is subject to the
// additional license terms as defined in the /docs/licenses directory.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
// POSSIBILITY OF SUCH DAMAGE.
//
// This is part of revision v2.2.0-7-g63f7c2ba1 of the AmbiqSuite Development Package.
//
//*****************************************************************************


#include "am_mcu_apollo.h"
#include "am_bsp.h"
#include "am_util.h"


//*****************************************************************************
//
// Example parameters.
//
//*****************************************************************************
#define PRINT_PDM_DATA              1
#define PDM_DATA_SIZE				12000

//*****************************************************************************
//
// Global variables.
//
//*****************************************************************************
volatile bool g_bPDMDataReady = false;
uint32_t g_ui32SampleFreq;
int16_t g_ui32PDMDataBuffer[PDM_DATA_SIZE];
uint8_t readBuffer[2];
uint32_t bytesWritten;


//*****************************************************************************
//
// PDM configuration information.
//
//*****************************************************************************
void *PDMHandle;

am_hal_pdm_config_t g_sPdmConfig =
{
		.eClkDivider = AM_HAL_PDM_MCLKDIV_1,
		.eLeftGain = AM_HAL_PDM_GAIN_0DB,
		.eRightGain = AM_HAL_PDM_GAIN_0DB,
		.ui32DecimationRate = 64,
		.bHighPassEnable = 0,
		.ui32HighPassCutoff = 0xB,
		.ePDMClkSpeed = AM_HAL_PDM_CLK_1_5MHZ,
		.bInvertI2SBCLK = 0,
		.ePDMClkSource = AM_HAL_PDM_INTERNAL_CLK,
		.bPDMSampleDelay = 0,
		.bDataPacking = 1,
		.ePCMChannels = AM_BSP_PDM_CHANNEL,
		.ui32GainChangeDelay = 1,
		.bI2SEnable = 0,
		.bSoftMute = 0,
		.bLRSwap = 0,
};

//*****************************************************************************
//
// PDM initialization.
//
//*****************************************************************************
void pdm_init(void)
{
	//
	// Initialize, power-up, and configure the PDM.
	//
	am_hal_pdm_initialize(0, &PDMHandle);
	am_hal_pdm_power_control(PDMHandle, AM_HAL_PDM_POWER_ON, false);
	am_hal_pdm_configure(PDMHandle, &g_sPdmConfig);
	am_hal_pdm_enable(PDMHandle);

	//
	// Configure the necessary pins.
	//
	am_hal_gpio_pinconfig(AM_BSP_PDM_DATA, g_AM_BSP_PDM_DATA);
	am_hal_gpio_pinconfig(AM_BSP_PDM_CLOCK, g_AM_BSP_PDM_CLOCK);

	//
	// Configure and enable PDM interrupts (set up to trigger on DMA
	// completion).
	//
	am_hal_pdm_interrupt_enable(PDMHandle, (AM_HAL_PDM_INT_DERR
			| AM_HAL_PDM_INT_DCMP
			| AM_HAL_PDM_INT_UNDFL
			| AM_HAL_PDM_INT_OVF));

	NVIC_EnableIRQ(PDM_IRQn);
}

//*****************************************************************************
//
// Print PDM configuration data.
//
//*****************************************************************************
void
pdm_config_print(void)
{
	uint32_t ui32PDMClk;
	uint32_t ui32MClkDiv;

	//
	// Read the config structure to figure out what our internal clock is set
	// to.
	//
	switch (g_sPdmConfig.eClkDivider)
	{
	case AM_HAL_PDM_MCLKDIV_4: ui32MClkDiv = 4; break;
	case AM_HAL_PDM_MCLKDIV_3: ui32MClkDiv = 3; break;
	case AM_HAL_PDM_MCLKDIV_2: ui32MClkDiv = 2; break;
	case AM_HAL_PDM_MCLKDIV_1: ui32MClkDiv = 1; break;

	default:
		ui32MClkDiv = 0;
	}

	switch (g_sPdmConfig.ePDMClkSpeed)
	{
	case AM_HAL_PDM_CLK_12MHZ:  ui32PDMClk = 12000000; break;
	case AM_HAL_PDM_CLK_6MHZ:   ui32PDMClk =  6000000; break;
	case AM_HAL_PDM_CLK_3MHZ:   ui32PDMClk =  3000000; break;
	case AM_HAL_PDM_CLK_1_5MHZ: ui32PDMClk =  1500000; break;
	case AM_HAL_PDM_CLK_750KHZ: ui32PDMClk =   750000; break;
	case AM_HAL_PDM_CLK_375KHZ: ui32PDMClk =   375000; break;
	case AM_HAL_PDM_CLK_187KHZ: ui32PDMClk =   187000; break;

	default:
		ui32PDMClk = 0;
	}

	//
	// Compute the effective sample frequency:
	// PDM clock / (MCLK divider * 2 * decimation rate).
	// With the settings above this is 1500000 / 128 = 11718 Hz.
	//
	g_ui32SampleFreq = (ui32PDMClk /
			(ui32MClkDiv * 2 * g_sPdmConfig.ui32DecimationRate));

	am_util_stdio_printf("Settings:\n");
	am_util_stdio_printf("PDM Clock (Hz):         %12d\n", ui32PDMClk);
	am_util_stdio_printf("Decimation Rate:        %12d\n", g_sPdmConfig.ui32DecimationRate);
	am_util_stdio_printf("Effective Sample Freq.: %12d\n", g_ui32SampleFreq);
}

//*****************************************************************************
//
// Start a transaction to get some number of bytes from the PDM interface.
//
//*****************************************************************************
void
pdm_data_get(void)
{
	//
	// Configure DMA and target address.
	//
	am_hal_pdm_transfer_t sTransfer;
	sTransfer.ui32TargetAddr = (uint32_t) g_ui32PDMDataBuffer;
	sTransfer.ui32TotalCount = PDM_DATA_SIZE*2;

	//
	// Start the data transfer.
	//
	am_hal_pdm_enable(PDMHandle);
	am_util_delay_ms(100);
	am_hal_pdm_fifo_flush(PDMHandle);
	am_hal_pdm_dma_start(PDMHandle, &sTransfer);
}

//*****************************************************************************
//
// PDM interrupt handler.
//
//*****************************************************************************
void am_pdm0_isr(void)
{
	uint32_t ui32Status;

	//
	// Read the interrupt status.
	//
	am_hal_pdm_interrupt_status_get(PDMHandle, &ui32Status, true);
	am_hal_pdm_interrupt_clear(PDMHandle, ui32Status);

	//
	// Once our DMA transaction completes, we disable the PDM and send a
	// flag back down to the main routine. Disabling the PDM is only necessary
	// because this program uses a single buffer for storing the recording.
	// More complex programs could use a system of multiple buffers to
	// allow the CPU to process one buffer while the DMA pulls PCM data
	// into another buffer.
	//
	if (ui32Status & AM_HAL_PDM_INT_DCMP)
	{
		am_hal_pdm_disable(PDMHandle);
		g_bPDMDataReady = true;
	}
}

//*****************************************************************************
//
// Main
//
//*****************************************************************************
int main(void)
{
	//
	// Perform the standard initialization for clocks, cache settings, and
	// board-level low-power operation.
	//
	am_hal_clkgen_control(AM_HAL_CLKGEN_CONTROL_SYSCLK_MAX, 0);
	am_hal_cachectrl_config(&am_hal_cachectrl_defaults);
	am_hal_cachectrl_enable();
	//am_bsp_low_power_init();

	am_hal_gpio_pinconfig(AM_BSP_GPIO_LED_BLUE, g_AM_HAL_GPIO_OUTPUT);
	//am_hal_gpio_pinconfig(AM_BSP_GPIO_LED_BLUE, g_AM_BSP_GPIO_LED_BLUE);

	//
	// Initialize the printf interface for UART output
	//
	am_bsp_uart_printf_enable();

	//
	// Turn on the PDM, set it up for our chosen recording settings, and start
	// the first DMA transaction.
	//
	pdm_init();
	//You can print the current configuration
	//pdm_config_print();
	am_hal_pdm_fifo_flush(PDMHandle);

	// Configure the UART receive transfer: block until 1 byte arrives.
	am_hal_uart_transfer_t transfer_config;
	transfer_config.ui32Direction = AM_HAL_UART_READ;
	transfer_config.pui8Data = readBuffer;
	transfer_config.ui32NumBytes = 1;
	transfer_config.ui32TimeoutMs = AM_HAL_UART_WAIT_FOREVER;
	transfer_config.pui32BytesTransferred = &bytesWritten;
	// Configure the UART send transfer: it sends the whole buffer
	// (12000 samples, 2 bytes each).
	am_hal_uart_transfer_t transfer_config_writebuffer;
	transfer_config_writebuffer.ui32Direction = AM_HAL_UART_WRITE;
	transfer_config_writebuffer.pui8Data = (uint8_t*)g_ui32PDMDataBuffer;
	transfer_config_writebuffer.ui32NumBytes = PDM_DATA_SIZE*2;
	transfer_config_writebuffer.ui32TimeoutMs = AM_HAL_UART_WAIT_FOREVER;
	transfer_config_writebuffer.pui32BytesTransferred = &bytesWritten;
	//
	// Loop forever while sleeping.
	//
	while (1)
	{
		//waiting for a trigger from the PC
		while(am_bsp_com_uart_transfer(&transfer_config) != AM_HAL_STATUS_SUCCESS)
			;
		//if the received character is an 'r', start a 1 second long recording
		if(readBuffer[0] == 'r') {
			am_devices_led_on(am_bsp_psLEDs, 0);
			g_bPDMDataReady = false;
			am_hal_pdm_fifo_flush(PDMHandle);
			//start data collection by utilizing the DMA
			pdm_data_get();
			//go to deep sleep until an interrupt wakes the core
			am_hal_sysctrl_sleep(AM_HAL_SYSCTRL_SLEEP_DEEP);
			//the PDM ISR sets g_bPDMDataReady once the DMA transfer completes
			while(!g_bPDMDataReady)
				;
			am_devices_led_off(am_bsp_psLEDs, 0);
			//send the data through the USB
			am_bsp_com_uart_transfer(&transfer_config_writebuffer);
		}

	}
}
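
On the PC side, the recorder above expects a single 'r' byte and answers with 12000 raw int16 samples (24000 bytes). The following Python sketch shows one way a host script could trigger and decode a recording. It is not the authors' script: the helper names are hypothetical, and it assumes the samples arrive as little-endian signed 16-bit values (the native layout on the Cortex-M4) with nothing else printed on the port.

```python
import struct

NUM_SAMPLES = 12000      # PDM_DATA_SIZE in the firmware
BYTES_PER_SAMPLE = 2     # int16_t


def read_exact(port, n_bytes: int) -> bytes:
    """Read exactly n_bytes from a port whose read() may return fewer bytes."""
    received = bytearray()
    while len(received) < n_bytes:
        chunk = port.read(n_bytes - len(received))
        if not chunk:
            raise TimeoutError(f"got {len(received)} of {n_bytes} bytes")
        received.extend(chunk)
    return bytes(received)


def record_once(port) -> list:
    """Send the 'r' trigger and decode one recording as signed 16-bit samples."""
    port.write(b"r")
    raw = read_exact(port, NUM_SAMPLES * BYTES_PER_SAMPLE)
    # '<' = little-endian, 'h' = signed 16-bit integer.
    return list(struct.unpack(f"<{NUM_SAMPLES}h", raw))
```

Any object with write() and read() methods works as the port. With pyserial installed, a serial.Serial instance opened with a timeout can be passed in directly; the port name and baud rate depend on your setup and are not specified in this project page.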

Simple training process

Python
# -*- coding: utf-8 -*-
"""simple_pipeline.ipynb

Automatically generated by Colaboratory.

Original file is located at
    https://colab.research.google.com/drive/1zHi_UF1MfRlDoTdRtT9CsOAVtJHenCsa
"""

import numpy as np
import matplotlib.pyplot as plt
import os, glob

# Commented out IPython magic to ensure Python compatibility.
#@title Mount Google Drive to get data {display-mode: "form"}
from google.colab import drive
drive.mount('/gdrive')

from dataloader import *

#load data - all background noise classes - full dataset
(x_train,y_train,x_val,y_val,x_test,y_test) = getDataset(src_dir="../data",train_class=(0,1,2,3,4,5,6,7,8,10,11,12,13,14,15,16,17,18))
x_train = np.concatenate(x_train,axis=0)
if(x_train.ndim < 4):
  x_train = np.expand_dims(x_train, axis=2)
  x_train = np.expand_dims(x_train, axis=2)
y_train = np.concatenate(y_train,axis=0)
y_train = np.argmax(y_train, axis=1)

x_val = np.concatenate(x_val,axis=0)
if(x_val.ndim < 4):
  x_val = np.expand_dims(x_val, axis=2)
  x_val = np.expand_dims(x_val, axis=2)
y_val = np.concatenate(y_val,axis=0)
y_val = np.argmax(y_val, axis=1)

x_test = np.concatenate(x_test,axis=0)
if(x_test.ndim < 4):
  x_test = np.expand_dims(x_test, axis=2)
  x_test = np.expand_dims(x_test, axis=2)
y_test = np.concatenate(y_test,axis=0)
y_test = np.argmax(y_test, axis=1)

#final dataset split sizes
print(x_train.shape)
print(y_train.shape)
print(x_val.shape)
print(y_val.shape)
print(x_test.shape)
print(y_test.shape)

# tf is needed below for saving and converting the model
import tensorflow as tf
from tensorflow.keras import Sequential, optimizers
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense

#create a very simple model
model = Sequential()
# 2 convolutional kernels
# Tf lite micro only supports Conv2D (Conv1D would be better for us here)
model.add(Conv2D(filters=2, kernel_size=(33,1), dilation_rate=2, activation='relu', input_shape=(12000,1,1)))
#Get the maximum - MaxPooling
model.add(MaxPool2D((11936,1)))
model.add(Flatten())
#produce an output [0-no spark --- 1-spark]
model.add(Dense(1, activation='sigmoid'))
#compile the model with the following training parameters
model.compile(loss='binary_crossentropy', optimizer=optimizers.Adam(), metrics=['binary_accuracy'])
model.summary()
# fit network
model.fit(x_train, y_train, validation_data=(x_val,y_val) ,epochs=10, batch_size=8, verbose=1, shuffle=True)
#evaluate network on test data
result = model.evaluate(x_test,y_test)
print("Accuracy on the test set: "+str(result[1]))

# Save tf.keras model in HDF5 format.
keras_file = "sparkCNN_baseline.h5"
tf.keras.models.save_model(model, keras_file)

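# Convert to TensorFlow Lite model.
# Note: from_keras_model_file is the TF 1.x API; in TF 2.x use
# tf.lite.TFLiteConverter.from_keras_model(model) instead.
converter = tf.lite.TFLiteConverter.from_keras_model_file(keras_file)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_LATENCY]
tflite_model = converter.convert()
with open("sparkCNN_baseline.tflite", "wb") as f:
  f.write(tflite_model)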

# Install xxd if it is not available
!apt-get -qq install xxd
# Save the file as a C source file
!xxd -i sparkCNN_baseline.tflite > sparkCNN_baseline.cc
# Print the source file
!cat sparkCNN_baseline.cc
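
The pooling window of 11936 in the model above is not arbitrary: it equals the length of the convolution output, so the max-pooling layer reduces each filter's response to a single value. The short Python sketch below reproduces that arithmetic for an unpadded ('valid') convolution, which is the Keras default used in the listing. The function name is ours, for illustration only.

```python
def conv_output_length(n_in: int, kernel: int, dilation: int = 1, stride: int = 1) -> int:
    """Output length of an unpadded ('valid') convolution along one axis."""
    effective_kernel = (kernel - 1) * dilation + 1
    return (n_in - effective_kernel) // stride + 1


# 12000 input samples, kernel length 33, dilation rate 2 (effective length 65).
n_out = conv_output_length(12000, kernel=33, dilation=2)

# Trainable parameters: 2 filters * 33 weights + 2 biases, then Dense(1) on 2 inputs.
n_params = (2 * 33 + 2) + (2 * 1 + 1)
print(n_out, n_params)
```

If the input length, kernel size, or dilation rate is changed, the pooling window has to be recomputed the same way to keep the global-maximum behaviour.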

The modified micro_speech example that implements spark sound detection

C/C++
No preview (download only).
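
Because this source is download-only, the page does not show how the converted model reaches the firmware. The training script hands it over as a C array generated with xxd -i. For systems without xxd, the Python sketch below produces approximately the same layout (an unsigned char array plus a length variable). The function name and the sample bytes are placeholders for illustration, not a real model.

```python
def bytes_to_c_array(data: bytes, var_name: str, per_line: int = 12) -> str:
    """Render bytes as a C unsigned char array plus a length variable."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk))
    body = ",\n".join(lines)
    return (f"unsigned char {var_name}[] = {{\n{body}\n}};\n"
            f"unsigned int {var_name}_len = {len(data)};\n")


# Placeholder bytes; in practice, read them from sparkCNN_baseline.tflite.
print(bytes_to_c_array(b"\x1c\x00\x00\x00TFL3", "sparkCNN_baseline_tflite"))
```

To convert the real model, read the .tflite file in binary mode and write the returned string to a .cc file.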

Credits

György Kalmár

Istvan Megyeri
