Peer-reviewed article

Contrast of multi‐resolution analysis approach to transhumeral phantom motion decoding

2021; Institution of Engineering and Technology; Volume: 6; Issue: 3; Language: English

10.1049/cit2.12039

ISSN

2468-6557

Authors

Ejay Nsugbe, Oluwarotimi Williams Samuel, Mojisola Grace Asogbon, Guanglin Li

Topic(s)

Neuroscience and Neural Engineering

Summary

CAAI Transactions on Intelligence Technology, Volume 6, Issue 3, p. 360-375. ORIGINAL RESEARCH PAPER. Open Access.

Ejay Nsugbe (Corresponding Author), Independent Scholar, orcid.org/0000-0003-0674-1611
Oluwarotimi William Samuel (Corresponding Author), Key Laboratory of Human-Machine Intelligence-Synergy Systems, Chinese Academy of Sciences (CAS), Shenzhen Institutes of Advanced Technology, Shenzhen, China
Mojisola Grace Asogbon, Key Laboratory of Human-Machine Intelligence-Synergy Systems, Chinese Academy of Sciences (CAS), Shenzhen Institutes of Advanced Technology, Shenzhen, China
Guanglin Li, Key Laboratory of Human-Machine Intelligence-Synergy Systems, Chinese Academy of Sciences (CAS), Shenzhen Institutes of Advanced Technology, Shenzhen, China

Correspondence: Ejay Nsugbe, Prospect Place, Swindon, SN1 3LG, UK. E-mail: ennsugbe@yahoo.com. Oluwarotimi William Samuel, Key Laboratory of Human-Machine Intelligence-Synergy Systems, Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, Shenzhen 518055, China. E-mail: samuel@siat.ac.cn

First published: 20 April 2021 https://doi.org/10.1049/cit2.12039
Abstract

In signal processing, multiresolution decomposition techniques allow for the separation of an acquired signal into sub-levels, where the optimal level within the signal minimises redundancy and uncertainties, and contains the information required for the characterisation of the sensed phenomena. In the area of physiological signal processing for prosthesis control, in scenarios where a signal decomposition analysis is required, the wavelet decomposition (WD) has been seen to be the favoured time-frequency approach for the decomposition of non-stationary signals. From a research perspective, the WD in certain cases has allowed for a more accurate motion intent decoding process following feature extraction and classification. Despite this, there is yet to be a widespread adoption of the WD in a practical setting due to perceived computational complexity. Here, for neuromuscular (electromyography) and brainwave (electroencephalography) signals acquired from a transhumeral amputee, a computationally efficient time domain signal decomposition method based on a series of heuristics was applied to process the acquired signals before feature extraction. The results showed an improvement in motion intent decoding performance for the proposed time-domain-based signal decomposition across four different classifiers for both the neuromuscular and brainwave signals when compared to the WD and the raw signal.

1 INTRODUCTION

The functional bionic arm, commonly referred to as a myoelectric prosthesis, represents the most technologically advanced replacement for the loss of an upper limb [1, 2]. The myoelectric prosthesis limb comprises a control system that is central to the effective functioning of the arm, as it serves as the intermediary between a bio-potential signal acquired from an amputated individual and the actuation of the prosthesis.
The bio-potential signal, in turn, represents an encoding of a desired gesture motion, which the control system maps to a respective machine actuation signal that drives the motors and actuators in the prosthesis arm towards the completion of the desired motion [1, 2]. Different control schemes have been investigated for the prosthesis controller, and the pattern recognition-based controller is seen to be the favoured scheme due to its intuitiveness and facilitation of control of the prosthesis limb, as reported in the literature [1, 2]. The pattern recognition control scheme is based around the association and decoding of various bio-potential signals into a respective control signal, which drives the prosthesis limb into performing an inferred gesture motion [3, 4].

Assuming the acquisition of a bio-potential signal, typically in the form of a time-series, the pattern recognition process begins with the feature extraction phase, where relevant parametric descriptors are computed from the signal to form a feature vector; this phase also serves as a means of dimensionality reduction [3, 4]. The feature vector is used to train a classifier whose role is to form the best representation that maps a set of input features to an associated label and motion intent. Supervised learning methods are the popular classification methods used in this area of research. Results have shown that the range of features extracted to build a feature vector, alongside the chosen classifier, influences the extent to which motion intent signals can be robustly decoded [5].

The signal analysis and projection domain typically dictates what category of features can be extracted from the signal and has, to a lesser extent, been subject to investigation in this area of research [6]. Owing to the way in which signals are acquired, and in terms of computational efficiency, the time domain can be said to be the default analysis domain for the majority of signals [6].
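The pattern recognition flow described above (feature extraction, feature vector, supervised classifier) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two features (mean absolute value and peak absolute value), the nearest-centroid classifier, the labels `rest` and `grip`, and the toy sample values are placeholders and are not the features, classifiers or data used in the paper.

```python
import math


def extract_features(window):
    """Reduce a window of samples to a short feature vector.

    The two features are illustrative placeholders, not the paper's feature set.
    """
    mean_abs = sum(abs(x) for x in window) / len(window)
    peak_abs = max(abs(x) for x in window)
    return (mean_abs, peak_abs)


def train_centroids(windows, labels):
    """Supervised training step: average the feature vectors of each class."""
    sums, counts = {}, {}
    for window, label in zip(windows, labels):
        feats = extract_features(window)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}


def classify(window, centroids):
    """Assign the label whose centroid is closest to the window's features."""
    feats = extract_features(window)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))


# Hypothetical toy windows standing in for two motion classes.
rest_windows = [[0.01, -0.02, 0.015, -0.01], [0.02, -0.01, 0.01, -0.015]]
grip_windows = [[0.8, -0.9, 0.7, -0.85], [0.9, -0.7, 0.95, -0.8]]
centroids = train_centroids(rest_windows + grip_windows, ["rest", "rest", "grip", "grip"])
predicted = classify([0.75, -0.8, 0.9, -0.7], centroids)
```

The window of several samples is reduced to a two-element feature vector before classification, which is the dimensionality reduction role that feature extraction plays in the text.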
Depending on the application in question, the frequency domain has also been seen to be a favoured analysis domain, as this allows for the decomposition of signals into a spectrum comprising the various frequency contributions in the signal, and the extraction of frequency-based features [7, 8]. The fast Fourier transform (FFT), which uses a series of sines and cosines of increasing frequencies as its basis functions, is the widely used frequency transformation method that decomposes time-series signals into their frequency components, but it has the shortcoming of discarding the temporal information associated with the signal [7, 8]. The short time Fourier transform (STFT) has been seen to be a workaround for this shortcoming, as it allows for a time-frequency representation of the signal, since it is a windowed Fourier transform across all time intervals in a time-series [9, 10]. The windowing requirement for the STFT implies that a fixed window is used for all frequency components, which in practice causes inaccuracies in the transform depending on the windowing option used [9, 10]. An uncertainty principle applies to the STFT, whereby accuracy in either frequency resolution or time localisation is lost [9, 10].

The wavelet decomposition (WD) is a multiresolution method which provides time-frequency information on a signal, and improves on the STFT's fixed-window limitation, as it adapts its window to the varying frequency levels within a signal [11]. Unlike the FFT, which decomposes a signal using sines and cosines, the WD allows for flexibility in the choice of 'mother wavelet', which serves as the basis function for the decomposition process; as a result, this approach has become favourable for stationary and non-stationary time-series signals [11].
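The loss of temporal information in a whole-signal Fourier transform, and its partial recovery through fixed windows, can be seen in a small sketch. It uses a naive discrete Fourier transform in plain Python rather than an optimised FFT, a rectangular non-overlapping window, and a hypothetical test signal (4 Hz for one second followed by 16 Hz for one second, sampled at 64 Hz); all names and values are assumptions chosen for illustration. The wavelet decomposition is not shown.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive discrete Fourier transform; returns magnitudes for bins 0..N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        mags.append(abs(coeff))
    return mags


def dominant_bin(samples):
    """Index of the strongest non-DC frequency bin."""
    mags = dft_magnitudes(samples)
    return max(range(1, len(mags)), key=mags.__getitem__)


def windowed_dominant_bins(samples, window):
    """Fixed-window (STFT-like) analysis with non-overlapping rectangular windows."""
    starts = range(0, len(samples) - window + 1, window)
    return [dominant_bin(samples[s:s + window]) for s in starts]


fs = 64  # hypothetical sampling rate in Hz
low = [math.sin(2 * math.pi * 4 * i / fs) for i in range(fs)]    # 4 Hz, first second
high = [math.sin(2 * math.pi * 16 * i / fs) for i in range(fs)]  # 16 Hz, second second
signal = low + high

# Whole-signal spectrum: 128 samples, so each bin spans 0.5 Hz.
whole_spectrum = dft_magnitudes(signal)
top_two = sorted(sorted(range(len(whole_spectrum)), key=whole_spectrum.__getitem__)[-2:])

# One-second windows: each bin spans 1 Hz, and the order of events is kept.
per_window = windowed_dominant_bins(signal, window=fs)
```

Here `top_two` identifies both frequency components (bins 8 and 32, i.e. 4 Hz and 16 Hz) but carries no information about when each occurred, whereas `per_window` recovers their order at the cost of a coarser frequency resolution, which is the trade-off the fixed window imposes.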
In the area of prosthesis control, the wavelet transform has been used for intent decoding purposes from acquired bio-potential signals. Although the results have shown that this is an effective domain for signal analysis and feature extraction, its widespread adoption in real-time prosthesis control interfaces is constrained by its computational complexity [12].

From the signal processing literature, Nsugbe et al. designed a time-domain-based algorithm for the source separation of a mixing process using the non-stationary signal resulting from the process [13-19]. This approach is a time-domain-based decomposition method that is centred around an iterative optimisation flow and uses heuristic reasoning and tuned linear amplitude thresholds as a basis for the signal separation process [13-21]. This method has been seen to be more computationally efficient than the WD, since once the optimal threshold level is determined, it can be implemented using the relevant parameters, or using analogue circuitry to rectify the signal and reduce the redundancies, leaving just the optimal segment of the signal for further processing [13-19]. The main shortcoming of this signal processing method is the lack of frequency information offered as part of the decomposition, as it is purely a time-domain-based approach, and hence a further computation of the FFT would be necessary for applications where frequency information is relevant [13-19]. This is not deemed to be an immediate limitation for the pattern recognition-based control scheme considered in this manuscript.

Thus, here, the specific investigation and contributions are as follows:
- A mathematical formalism of the proposed time-domain-based decomposition algorithm.
- Investigation of the effectiveness of a motion intent decoding approach from bio-potentials acquired from electromyography (EMG) and electroencephalography (EEG), with the application of the proposed time-domain-based method alongside a reduced set of features.
- A results benchmarking exercise between the proposed method, the raw signal and the wavelet-decomposed version of the signal across four different classifiers.

2 MATERIALS AND METHODS

The data used for the signal processing case study presented were acquired from the study conducted by Li et al. [22]. This section describes the theoretical model behind the various sensing modules used to acquire data, assuming the solution to the forward problem, and according to the data collection procedure by Li et al. [22], followed by the signal processing and classification architectures employed here.

2.1 Mathematical models and data collection instrumentation

The theoretical formulations behind action potentials are useful numerical tools that help to describe the individual anatomical and physiological contributors which, under fixed conditions, produce superimposed extracellular action potentials that can be recorded using a set of surface electrodes. Assuming the forward problem framework, an overview of the theoretical concept behind the EMG and EEG signals, alongside the recording instruments used by Li et al. [22], is described as follows.

2.1.1 EMG underlying principle

EMG signals are electrophysiological signals present within muscle tissue and represent superimposed motor unit action potentials (MUAPs). During muscular contraction, the simultaneous firing of MUAPs occurs; thus the resulting EMG signals are in the form of non-stationary time-series, dependent on intrinsic anatomical properties [23, 24]. The dynamics of electrophysiology can be modelled as electrical current flowing through tissue using the principle of volume conduction and a three-dimensional (3D) view of Ohm's law [23, 24].
For a biological tissue with uniform conductivity $\sigma_i$, the 3D electric potential recorded at point $P_0(x_0, y_0, z_0)$ and generated by a source current $I_s$ at point $P(x, y, z)$ can be described as seen in Equation (1) [23, 24]:

$$V_{p_0} = \frac{1}{4\pi\sigma_i}\,\frac{I_s}{r_i} \tag{1}$$

where $V_{p_0}$ is the electrical voltage potential, and $r_i$ represents the shortest distance between points $P_0$ and $P$. Equation (1) shows the dependency of the voltage potential recorded at a specific point on the source current, and its inverse relationship with the $\sigma_i$ and $r_i$ values [23-25]. However, due to the biophysics of motor units and their simultaneous firing patterns, these sorts of bioelectric phenomena are represented as linear superpositions emanating from multiple sources [23-25].

Using the theory of dipoles and the framework of the variation of electric potential membranes within an electrical field in an extracellular medium, Wilson et al. [26] and Plonsey and Barr [27] postulated that the behaviour of bioelectricity is similar to the fields produced by a lumped dipole. On this note, a fibre element of length $dx$ with a focused current flow in the vicinity of an extracellular potential is expressed as $\bar{p}\,dx$, where $\bar{p}$ is the dipole current per unit length [26, 27]. As the current propagates from the source into an unbounded space, the resulting extracellular potential can be expressed as seen in Equation (2) [26, 27]:

$$d\phi_e = \frac{1}{4\pi\sigma_e}\,\frac{d\left(\frac{1}{r}\right)}{dx}\,\bar{p}(x, t)\,dx \tag{2}$$

where $\phi_e$ is an extracellular potential, $\sigma_e$ is the conductivity of the extracellular medium, and $r$ is the distance from the excitation source to the recording point $P_0$.
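As a quick numerical illustration of Equation (1) and of the linear superposition of multiple sources mentioned above, the sketch below evaluates the point-source potential in plain Python. The function names, the conductivity of 0.3 S/m, the 1 µA source currents and the electrode positions are hypothetical values chosen only for the example; the separation $r_i$ is taken as the straight-line distance between source and recording point.

```python
import math


def point_source_potential(source_current, conductivity, distance):
    """Equation (1): V = I_s / (4 * pi * sigma_i * r_i) in a uniform medium."""
    if conductivity <= 0 or distance <= 0:
        raise ValueError("conductivity and distance must be positive")
    return source_current / (4.0 * math.pi * conductivity * distance)


def superposed_potential(sources, electrode, conductivity):
    """Linear superposition of several point sources at one recording point.

    `sources` is a list of (current, (x, y, z)) pairs; `electrode` is (x0, y0, z0).
    """
    return sum(
        point_source_potential(current, conductivity, math.dist(position, electrode))
        for current, position in sources
    )


sigma = 0.3  # hypothetical tissue conductivity in S/m
v_near = point_source_potential(1e-6, sigma, 0.01)  # 1 uA source at 1 cm
v_far = point_source_potential(1e-6, sigma, 0.02)   # same source at 2 cm

# Two equal sources placed symmetrically about the electrode.
electrode = (0.0, 0.0, 0.0)
sources = [(1e-6, (0.01, 0.0, 0.0)), (1e-6, (-0.01, 0.0, 0.0))]
v_total = superposed_potential(sources, electrode, sigma)
```

Doubling the distance halves the potential (`v_near` is twice `v_far`), and two equal, equidistant sources give twice the single-source value, which reflects the inverse dependence on $r_i$ and the additive nature of the contributions.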
If $r$ is located along the co-ordinates of $P$ and $P_0$, then the distance can be calculated using Equation (3) [26, 27]:

$$r = \left[(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2\right]^{1/2} \tag{3}$$

From these equations, an integral sum of the various potentials can then be computed to obtain the resulting field from the lumped dipole element, as shown in Equation (4) [26, 27]:

$$\phi_e(x_0, y_0, z_0, t) = \int_{x=-\infty}^{x=\infty} \frac{\bar{p}(x, t)}{4\pi\sigma_e\left[(x - x_0)^2 + y_0^2 + z_0^2\right]^{3/2}}\,dx \tag{4}$$

where $t$ is time.

2.1.2 EMG sensors and signal acquisition

Li et al. [22] utilised 32-channel high-density surface electrodes with the REFA 128 model (TMS International BV, Netherlands), distributed around the stump and deltoid of the amputee subject. The data were sampled at 1024 Hz, and the acquisition electronics contained a bandpass filter in the frequency region of 10–500 Hz, with a 24-bit resolution [22].

2.1.3 EEG underlying principle

The human brain comprises billions of neurons of varying geometries, depending on the section. An action potential from a single neuronal cell produces a negligible amount of electrical potential, which in turn is challenging to record with acquisition electrodes [28, 29]. When a cluster of neuronal cells activates simultaneously, the resulting electric potential is substantial enough to be acquired with electrodes to form an EEG signal [28, 29]. The flow of bioelectrical current within the brain tissue causes the production of an electromagnetic field within the skull which, like the EMG phenomena, can be modelled using dipole theory [28, 29]. The forward problem in the case of EEG can be numerically established assuming a set of acquisition electrodes, tissue conductivity and a model of the head [28, 29].
For a dipole of moment $q$ at a point $S$, and a multilayer head model $L$ with radii spanning $0 < r_1 < r_2 < \ldots < r_L$, the potential $u$ measured at point $x$ ($\|x\| = r_L$) can be seen in Equation (5) [30, 31]:

$$u(r_s, q, x) = \frac{\|q\|}{4\pi\sigma_L r_L^2} \sum_{n=1}^{\infty} \frac{2n+1}{n}\left(\frac{r_s}{r_L}\right)^{n-1} f_n\left[n\cos\alpha\,P_n(\cos\gamma) + \cos\beta\,\sin\alpha\,P_n^1(\cos\gamma)\right] \tag{5}$$

where $\alpha$ is the angle between $S$ and $q$, $\beta$ is the angle between $S$ and the signal acquisition point $x$, $\gamma$ is the angle between the two vectors denoted by $S$ and $q$ on one side and $S$ and $x$ on the other side, and $P_n$ and $P_n^1$ represent the Legendre and associated Legendre polynomials of the series, respectively.

2.1.4 EEG sensors and signal acquisition

The EEG sensors used by Li et al. [22] were the 64-channel EasyCap (Herrsching, Germany) with Ag-AgCl electrodes, and the Neuroscan system version 4.3. The signals were acquired at a sample rate of 1024 Hz with bandpass filtering at 0.05–100 Hz [22].

2.2 Data collection process and electrode channel selection

The data collection process by Li et al. [22] included the acquisition of EMG and EEG from a group of subjects who had undergone amputation for traumatic reasons, and the study was granted ethical approval by the Institutional Review Board of Shenzhen Institutes of Advanced Technology, with a unique reference number of SIAT-IRB-150515-H0077. A single amputee's data set has been used for the work presented. The subject is a 49-year-old transhumeral amputee with a left-side amputation, three years post amputation, with a stump length of 20 cm measured from the shoulder downwards [22]. Data were collected for five gesture motions, namely Hand Open (HO), Hand Close (HC), Wrist Pronation (WP), Wrist Supination (WS) and No Movement (NM). A total of 10 repetitions was performed for each gesture, with as close as possible to a constant contraction level, and with breaks factored in as required. The experimental setup can be seen in Figure 1.
FIGURE 1 A representative subject performing the arm tasks during the experimental session [22]

2.2.1 Electrode channel selection

An electrode channel selection process was employed by Li et al. [22] as a pre-processing step prior to the signal processing. This channel reduction can also be considered a data dimensionality reduction phase, allowing for a quicker computation time [32]. A greedy search algorithm termed sequential forward selection (SFS) was applied by Li et al. [22] to select 10 optimal electrode channels, a summary of which can be seen in the flow sequence below:

Step 1: Iteration loop initialisation: $S_0 = \{\emptyset\}$
Step 2: Maximise accuracy during selection, as in $\mathrm{Acc}(S_k + x^{*}) = \operatorname{argmax}\,\mathrm{Acc}(S_k + x_j^{*})$
Step 3: Repeat for all $k \in \{1, 2, \ldots, n\}$
Step 4: Terminate loop after the $n$th item in set $k$ has been iterated

where $S_k$ is an already selected electrode, Acc is the classification accuracy, $x_j^{*}$ is an electrode channel, $k$ represents the full set of electrode channels, and $j$ is the $j$th element in set $k$.

From this, a reduction from 96 electrode channels (EMG and EEG) to a total of 20 electrodes was achieved, with the optimal channels for the EMG spread across the deltoids, bicep and triceps. It was seen that a number of the optimal EEG electrode channels were spread around the visual portion of the cerebral cortex responsible for the projection of visual information, thereby implying the importance of motor imagery in the control of phantom motions [22, 32, 33]. The gesture motion data from each electrode were divided into 10 segments, corresponding to the number of repetitions made during each acquisition exercise, with each segment comprising 512 sample points.
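The greedy selection and the segmentation into repetitions described above can be sketched as follows. This is an illustrative outline only: in the study the score would be a classification accuracy obtained from the selected channels, whereas here `toy_score`, the channel names, the usefulness values and the redundancy penalty are hypothetical stand-ins so that the example is self-contained.

```python
def sequential_forward_selection(channels, score, n_select):
    """Greedy SFS: start from an empty set and repeatedly add the channel
    that gives the highest score when combined with those already chosen."""
    selected = []
    remaining = list(channels)
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda ch: score(selected + [ch]))
        selected.append(best)
        remaining.remove(best)
    return selected


def split_repetitions(samples, n_reps, rep_length):
    """Divide one channel's recording into equal-length repetition segments."""
    if len(samples) < n_reps * rep_length:
        raise ValueError("recording is shorter than n_reps * rep_length")
    return [samples[i * rep_length:(i + 1) * rep_length] for i in range(n_reps)]


# Hypothetical stand-in for classification accuracy: each channel has an
# individual usefulness, and a redundant pair adds little when used together.
usefulness = {"ch1": 0.9, "ch2": 0.85, "ch3": 0.4, "ch4": 0.1}
redundant_pairs = {frozenset(("ch1", "ch2"))}


def toy_score(subset):
    total = sum(usefulness[ch] for ch in subset)
    penalty = sum(
        0.8
        for a in subset
        for b in subset
        if a < b and frozenset((a, b)) in redundant_pairs
    )
    return total - penalty


chosen = sequential_forward_selection(usefulness, toy_score, n_select=2)

# 10 repetitions of 512 samples each, as in the segmentation described above.
recording = [0.0] * (10 * 512)
repetitions = split_repetitions(recording, n_reps=10, rep_length=512)
```

With these toy values the search picks `ch1` first and then `ch3` rather than the individually stronger `ch2`, because the score is evaluated on the combined set; this is the behaviour that makes SFS a subset search rather than a simple ranking of channels.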
2.3 Signal processing and classification

2.3.1 Analysis of domains and signal decomposition

Time domain

The proposed method works with the concept that, for an absolute representation of a non-stationary time-series signal, a transduced manifestation of a single event in time can be denoted by a unit impulse peak and exponential decay:

$$s(t) = e^{-at}u(t) + \zeta, \quad u(t) = \begin{cases} 0, & a < 0 \\ 1, & a \ge 0 \end{cases}$$

where $t$ is the impulse time, $u(t)$ is a step function, $a$ indicates that the function is 0 until $a \ge 0$, and $\zeta$ is additive white noise. The unit impulse can be characterised by its time, amplitude and characteristics related to its decay [34]. For signals that have consecutive events occurring, the time-series contains overlapping impulse peaks, which cause the decay characteristics associated with each impulse peak to become challenging to localise and characterise [16]. Due to this, the time and amplitude characteristics represent key properties of the signal, as described by Nsugbe et al. [16]. Thus, it can be said that an optimal amplitude region within a given time-series is one where source information is maximised, and interferences (noise and uncertainties) are minimal [16].

As mentioned, the proposed method applies a series of heuristically tuned linear thresholds of varying amplitudes where, for each amplitude region, the amplitudes of the peaks are localised, from which a sub time-series is formed [13-19]. That is to say, given a signal $s(t)$ and a single amplitude threshold, a sub time-series is formed:

$$X_{ij} = t_l\,s(t)$$

where $X_{ij}$ is a sub time-series for the $i$th threshold iteration and $j$th division within the signal, for $i = (1, 2, \ldots, n)$ and $j = (1, 2, \ldots, n)$, and $t_l$ is the amplitude level of the threshold.
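A minimal sketch of the impulse-with-decay model and of a single amplitude threshold is given below. It rests on stated assumptions: the threshold is read as partitioning the rectified samples into an upper and a lower amplitude region (one possible interpretation of how the sub time-series are formed), and the decay rate, noise level, sample count and function names are hypothetical.

```python
import math
import random


def decaying_impulse(n_samples, decay, noise_std, rng):
    """s(t) = exp(-a * t) + zeta for t = 0, 1, ..., n_samples - 1."""
    return [math.exp(-decay * t) + rng.gauss(0.0, noise_std) for t in range(n_samples)]


def split_by_threshold(samples, level):
    """Rectify the signal and split it into two amplitude regions.

    Returns (upper, lower): samples at or above the level, and samples below it.
    This partition is an assumed reading of the threshold step.
    """
    rectified = [abs(x) for x in samples]
    upper = [x for x in rectified if x >= level]
    lower = [x for x in rectified if x < level]
    return upper, lower


rng = random.Random(0)  # seeded so the illustration is repeatable
noisy = decaying_impulse(n_samples=10, decay=0.5, noise_std=0.01, rng=rng)

# Noise-free case, thresholded at half of the absolute maximum.
clean = decaying_impulse(n_samples=10, decay=0.5, noise_std=0.0, rng=rng)
level = 0.5 * max(abs(x) for x in clean)
upper, lower = split_by_threshold(clean, level)
```

For the noise-free impulse only the first two samples lie in the upper region, which shows how a single linear threshold isolates the high-amplitude portion of an event from its decaying tail.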
The first iteration of a single heuristically tuned threshold within a time-series yields two time-series divisions, $X = \{X_{11}, X_{12}\}$, where $X_{11}$ represents the first threshold iteration and the first time-series division, and $X_{12}$ represents the first threshold iteration and the second time-series division. From this, features are extracted for each item in set $X$, and an iterative improvement exercise is conducted to find an optimal region within the signal using a chosen performance index $J$, selecting the $j$th division which yields $\operatorname{argmax}(J)$. The resulting $X_{11}, X_{12}$ are further decomposed using a successively tuned linear threshold based on heuristic reasoning for each further $i$th threshold iteration, until a conditional minimum is found, mimicking a convex optimisation problem. The amplitude threshold parameter used to compute the signal level which produces the minimum is referred to as $X_{opt}$. The threshold parameters of $X_{opt}$ are referred to as the optimal amplitude decomposition level within the time-series, and are used for the analysis and further processing of subsequent signals from the same source, as it is believed that this generalises across further kinds of signals acquired from the source, assuming the recording instrumentation remains unchanged. A resulting time-series signal from a tuned threshold region accompanied by a peak detection process represents a filtered and decomposed reduced-dimensional time-series.

The following are the heuristics and steps taken to implement the proposed method and determine the optimal region in the time-series:

Step 1. Assuming a multitude of time-series, $S_n = x_1, x_2, \ldots, x_N$, containing information for various signals that need to be classified, express each time-series in its absolute form, denoted by $|S_n|$.
Step 2. For each of $S_n$, set an initialisation threshold. For this work, this was arbitrarily tuned to be 50% of the absolute maximum of the signal, thus a scale factor of 0.5, denoted by $t_l$ and expressed by $t_l \cdot \max|x_n|$. For cases with multiple time-series resulting from multiple sample repetitions and electrode channels, the following expression can be used to obtain an initialisation threshold value:

$$t_l \cdot \left(\frac{1}{N}\sum_{n=1}^{N}\max|S_n|\right)$$

where $N$ is the number of samples, and $S_n$ represents the $n$th time-series. Note that the $t_l$ parameter can be chosen as required, which adds a degree of flexibility to this decomposition method. In previous studies, $t_l$ has been selected with an a priori value [13-19].

Step 3. For each of the subsequent sub time-series, $X = \{X_{11}, X_{12}\}$, obtained for the various $S_n$, identify the peaks within the signal and form a filtered sub time-series $X_{filt} = \{X_{11filt}, X_{12filt}\}$, where a peak can be said to be a data sample whose amplitude is greater than or equal to that of its nearest neighbours. That is, for a time-series $S_n$, the peaks can be said to be the samples that satisfy the criteria expressed mathematically as:

$$S_n.\mathrm{peaks}(x) = \begin{cases} x_{peak.n}, & x_n \ge x_{n-1} \text{ and } x_n \ge x_{n+1} \\ 0, & \text{otherwise} \end{cases}$$

At this stage, a selection process is conducted to assess the information quality and 'goodness' within each sub time-series. For the purpose of this exercise, the mean of the peaks (MP) and cumulative sum of peaks (SP) are extracted from each sub time-series to form a feature vector. $J$ was chosen to be the Euclidean distance metric used to compute the distance between points in Euclidean space, normalised by the standard deviation of each of the time series in question, and can
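
Steps 1 to 3 can be sketched in plain Python as below. This is a hedged outline rather than the authors' implementation: the function names, the toy series, the handling of end points (only interior samples are tested as peaks) and the choice to group detected peaks by amplitude region are assumptions made for illustration. The performance index $J$ and the iterative threshold tuning are not sketched.

```python
def initial_threshold(series_list, scale=0.5):
    """Step 2: scale factor t_l times the mean of the per-series absolute maxima."""
    maxima = [max(abs(x) for x in series) for series in series_list]
    return scale * sum(maxima) / len(maxima)


def find_peaks(samples):
    """Step 3: interior samples that are >= both nearest neighbours.

    With this criterion every sample on a flat plateau counts as a peak.
    """
    return [
        samples[i]
        for i in range(1, len(samples) - 1)
        if samples[i] >= samples[i - 1] and samples[i] >= samples[i + 1]
    ]


def decompose(samples, level):
    """Rectify (Step 1), detect peaks, and group them into two amplitude regions."""
    peaks = find_peaks([abs(x) for x in samples])
    upper = [p for p in peaks if p >= level]
    lower = [p for p in peaks if p < level]
    return upper, lower


def peak_features(peaks):
    """Return (MP, SP): the mean of the peaks and the cumulative sum of the peaks."""
    if not peaks:
        return (0.0, 0.0)
    total = sum(peaks)
    return (total / len(peaks), total)


# Hypothetical toy series standing in for two recorded repetitions.
series_a = [0.1, -0.9, 0.2, 0.4, -0.3, 1.0, 0.1]
series_b = [-2.0, 0.5]
pooled_level = initial_threshold([series_a, series_b])  # 0.5 * (1.0 + 2.0) / 2

level_a = initial_threshold([series_a])
upper_a, lower_a = decompose(series_a, level_a)
features_upper = peak_features(upper_a)
features_lower = peak_features(lower_a)
```

For `series_a` the rectified peaks are 0.9, 0.4 and 1.0; with the threshold at half the absolute maximum, the first and last fall in the upper region and 0.4 in the lower region, and each region is then summarised by its (MP, SP) pair.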

Reference(s)