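This library is a complete Kotlin port of python_speech_features and can be used in Android and iOS projects.
It provides common speech features for automatic speech recognition (ASR), including MFCCs and filterbank energies.
To learn more about MFCCs, read more.
We use Kotlin Multiplatform to support multiple platforms.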
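As background for the MFCC and filterbank features mentioned above, the sketch below shows the mel-scale conversion that python_speech_features uses to space its filters. The function names `hzToMel`, `melToHz`, and `melCenterFrequencies` are illustrative and are not part of this library's API; whether the port exposes equivalent helpers is not stated here.

```kotlin
import kotlin.math.log10
import kotlin.math.pow

// Convert a frequency in Hz to mels (the formula used by python_speech_features).
fun hzToMel(hz: Double): Double = 2595.0 * log10(1.0 + hz / 700.0)

// Inverse conversion: mels back to Hz.
fun melToHz(mel: Double): Double = 700.0 * (10.0.pow(mel / 2595.0) - 1.0)

// Hypothetical helper: centre frequencies (Hz) of nFilt triangular filters
// spaced evenly on the mel scale between lowFreq and highFreq.
fun melCenterFrequencies(nFilt: Int, lowFreq: Double, highFreq: Double): DoubleArray {
    val lowMel = hzToMel(lowFreq)
    val highMel = hzToMel(highFreq)
    // nFilt + 2 evenly spaced points span the range; the two edge points are dropped.
    val step = (highMel - lowMel) / (nFilt + 1)
    return DoubleArray(nFilt) { i -> melToHz(lowMel + (i + 1) * step) }
}
```

For example, `hzToMel(8000.0)` is roughly 2840 mel, and `melCenterFrequencies(64, 0.0, 8000.0)` returns 64 increasing frequencies that are dense at the low end and sparse at the high end, which is the point of the mel scale.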
Add jitpack.io to your project repositories:
allprojects {
    repositories {
        google()
        maven { url 'https://jitpack.io' }
    }
}
Add the dependency:
dependencies {
    implementation "com.github.MerlynMind:kotlin_speech_features:${version}"
}
A sample application is included in this repository to help you understand the implementation.
private val speechFeatures = SpeechFeatures()

// MFCC features
val mfccResult = speechFeatures.mfcc(MathUtils.normalize(wav), nFilt = 64)
// Filterbank energies
val fbankResult = speechFeatures.fbank(MathUtils.normalize(wav), nFilt = 64)
// Log filterbank energies
val logFbankResult = speechFeatures.logfbank(MathUtils.normalize(wav), nFilt = 64)
// Spectral subband centroids
val sscResult = speechFeatures.ssc(MathUtils.normalize(wav), nFilt = 64)
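The calls above take a `wav` value that the snippet does not define. A minimal sketch of one way to produce it, assuming `wav` is an `IntArray` of signed 16-bit PCM samples and that the raw little-endian sample bytes are already in memory (WAV header parsing is not handled here). The helper name `pcm16ToIntArray` is hypothetical and not part of this library; it relies on `java.nio`, so it applies to the Android/JVM target only.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper: decode little-endian signed 16-bit PCM bytes into an IntArray.
// A trailing odd byte, if any, is ignored.
fun pcm16ToIntArray(bytes: ByteArray): IntArray {
    val samples = ByteBuffer.wrap(bytes)
        .order(ByteOrder.LITTLE_ENDIAN)
        .asShortBuffer()
    // Widen each 16-bit sample to Int, preserving its sign.
    return IntArray(samples.remaining()) { i -> samples.get(i).toInt() }
}
```

As a check, the six bytes `01 00 FF FF 00 80` decode to the three samples 1, -1, and -32768. The resulting array could then be passed to `MathUtils.normalize` as in the calls above.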
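In Xcode, choose File > Add Packages..., then click the Add Package button.
A sample application is included in this repository to help you understand the implementation.
First, convert your signal to a KotlinIntArray and normalize it.
import KotlinSpeechFeatures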
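let signal = [Int](1...1000) // Example signal
let normalized = MathUtils.Companion.init().normalize(sig: toKotlinIntArray(arr: signal))

// Copy a Swift [Int] into a KotlinIntArray, element by element.
func toKotlinIntArray(arr: [Int]) -> KotlinIntArray {
    // Size by count (number of elements), not capacity (allocated storage).
    let result = KotlinIntArray(size: Int32(arr.count))
    // Iterating over indices is safe for empty arrays, unlike 0...(arr.count-1).
    for i in arr.indices {
        result.set(index: Int32(i), value: Int32(arr[i]))
    }
    return result
}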
let speechFeatures = SpeechFeatures()
// MFCC features
let mfccResult = speechFeatures.mfcc(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, numCep: 13, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, ceplifter: 22, appendEnergy: true, winFunc: nil)
// Filterbank energies
let fbankResult = speechFeatures.fbank(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
// Log filterbank energies
let logFbankResult = speechFeatures.logfbank(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
// Spectral subband centroids
let sscResult = speechFeatures.ssc(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
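The `sampleRate`, `winLen`, and `winStep` arguments above determine how many frames (rows) each feature matrix contains. The sketch below reproduces the framing arithmetic used by python_speech_features; it assumes this port follows the same rule, which is not confirmed here. The function name `frameCount` is hypothetical and not part of the library.

```kotlin
import kotlin.math.ceil
import kotlin.math.roundToInt

// Hypothetical helper: number of analysis frames for a signal, following the
// framing rule of python_speech_features (assumed to carry over to this port).
fun frameCount(numSamples: Int, sampleRate: Int, winLen: Double, winStep: Double): Int {
    val frameLen = (winLen * sampleRate).roundToInt()   // samples per window
    val frameStep = (winStep * sampleRate).roundToInt() // samples between window starts
    // A signal no longer than one window still yields a single (padded) frame.
    if (numSamples <= frameLen) return 1
    return 1 + ceil((numSamples - frameLen).toDouble() / frameStep).toInt()
}
```

With the values used above (16000 Hz, 0.025 s windows, 0.01 s steps), each window is 400 samples and the step is 160 samples, so one second of audio (16000 samples) gives `frameCount(16000, 16000, 0.025, 0.01) == 99`. Under the same assumption, the MFCC call with `numCep: 13` would produce a 99 x 13 matrix for that input.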
Coming soon...
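Interested in contributing to this library? Thank you for your interest! We are always looking for improvements to the project and greatly appreciate contributions from open-source developers.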
git clone https://github.com/merlynmind/kotlin_speech_features
cd kotlin_speech_features
git checkout -b name_for_new_branch
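If you would like to say thanks and/or support active development of this library:
Thank you for your interest in growing the reach of our library!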
wget http://voyager.jpl.nasa.gov/spacecraft/audio/english.au
sox english.au -e signed-integer english.wav