Wei Ke received the Ph.D. degree from the School of Computer Science and Engineering, Beihang University, China. He is currently a Professor with the Ph.D. in Computer Applied Technology Program at Macao Polytechnic University. His research interests include computer vision, image processing, computer graphics, programming languages, and tool support for object-oriented and component-based engineering and systems, as well as the design and implementation of open platforms for applications of computer graphics and pattern recognition, including programming tools, environments, and frameworks.

  • Current Employer/Organization

    Macao Polytechnic University - Faculty of Applied Sciences
  • Current Position

    Professor

    Macao Polytechnic University

  • Subjects Taught

    • Data Structures and Algorithms
    • Computer Security
    • Functional Analysis
    • Functional Programming
    • Category Theory
  • Education

    • PhD in Computer Applied Technology, School of Computer Science and Engineering, Beihang University, Beijing, China

    • MSc in Computer Software and Theory, Institute of Software, Chinese Academy of Sciences, Beijing, China

    • BSc in Computer Software, Department of Computer Science, Sun Yat-sen University, Guangzhou, Guangdong, China

  • Research Interests

    • Programming Languages
    • Component-Based Software Engineering and Systems
    • Computer Vision and Pattern Recognition
    • Image Processing
  • Journal Papers

    1. H. Chi, Y. Lu, C. Xie, W. Ke and B. Chen, "Spatio-temporal attention based collaborative local–global learning for traffic flow prediction," in Engineering Applications of Artificial Intelligence, vol. 139 (B), 2025, doi: 10.1016/j.engappai.2024.109575.
      Abstract. Traffic flow prediction is crucial for intelligent transportation systems (ITS), providing valuable insights for traffic control, route planning, and operation management. Existing work often separately models the spatial and temporal dependencies and primarily relies on predefined graphs to represent spatio-temporal dependencies, neglecting the traffic dynamics caused by unexpected events and the global relationships among road segments. Unlike previous models that primarily focus on local feature extraction, we propose a novel collaborative local–global learning model (LOGO) that employs spatio-temporal attention (STA) and graph convolutional networks (GCN). Specifically, LOGO simultaneously extracts hidden traffic features from both local and global perspectives. In local feature extraction, a novel STA is devised to directly attend to spatio-temporal coupling interdependencies instead of separately modeling temporal and spatial dependencies, and to capture in-depth spatio-temporal traffic context with an adaptive graph focusing on the dynamics in traffic flow. In global feature extraction, a global correlation matrix is constructed and GCNs are utilized to propagate messages on the obtained matrix to achieve interactions between both adjacent and similar road segments. Finally, the obtained local and global features are concatenated and fed into a gated aggregation to forecast future traffic flow. Extensive experiments on four real-world traffic datasets sourced from the Caltrans Performance Measurement System (PEMS03, PEMS04, PEMS07, and PEMS08) demonstrate the effectiveness of our proposed model. LOGO achieves the best performance over 18 state-of-the-art baselines and the best prediction performance with the highest improvement of 6.06% on the PEMS07 dataset. Additionally, two real-world case studies further substantiate the robustness and interpretability of LOGO.
    2. M.Q. Zhang, W. Ke, Y.L. He, Q.X. Zhu and Y. Xu, "Group-Sparse Differential Reweighted Latent Matrix Factorization for Consistency Completion Under Non-Uniform Sensor Failures," in IEEE Sensors Journal, doi: 10.1109/JSEN.2024.3519357.
      Abstract. In the contemporary industrial landscape, the widespread deployment of data collection units has become the standard, significantly enhancing the synchronization of data-driven control and monitoring systems. However, high noise levels and sensor failures frequently lead to non-uniform data loss, including random and block missing, which severely hinders the real-time integration of communication in sampling processes. To address this challenge, we propose a missing data completion method based on group-sparse differential reweighted latent matrix factorization (GSDRMF). The proposed method mitigates the impact of noise and sparse outliers on the global piecewise smoothness of the data by incorporating Frobenius and sparse norm constraints, enabling more precise rank approximation. In the low-rank approximation phase, we employ Burer-Monteiro non-convex reweighted factorization to estimate the rank of the partially observed matrix. Simultaneously, leveraging temporal consistency, a group-sparse norm constraint is applied to the temporal gradient of the latent matrix. Finally, using a dual non-convex alternating direction method of multipliers optimization algorithm embedded with Fast Fourier Transform, the optimized latent variables are efficiently computed to meet optimality conditions while accelerating the overall computation speed. The proposed method is validated through its application to two real-world industrial processes, demonstrating its effectiveness in handling both non-uniform random and block missing data.
    3. Z. Wang, J. Wu, R. Fan, W. Ke and L. Wang, "VPRF: Visual Perceptual Radiance Fields for Foveated Image Synthesis," in IEEE Transactions on Visualization and Computer Graphics, vol. 30, no. 11, pp. 7183-7192, 2024, doi: 10.1109/TVCG.2024.3456184.
      Abstract. Depth estimation extracting scenes' structural information is a key step in various light field (LF) applications. However, most existing depth estimation methods are based on the Lambertian assumption, which limits the application in non-Lambertian scenes. In this paper, we discover a unique transparent cheating problem for non-Lambertian scenes which can effectively spoof depth estimation algorithms based on photo consistency. It arises because the spatial consistency and the linear structure superimposed on the epipolar plane image form new spurious lines. Therefore, we propose centrifugal consistency and centripetal consistency for separating the depth information of multi-layer scenes and correcting the error due to the transparent cheating problem, respectively. By comparing the distributional characteristics and the number of minimal values of photo consistency and centrifugal consistency, non-Lambertian regions can be efficiently identified and initial depth estimates obtained. Then centripetal consistency is exploited to reject the projection from different layers and to address transparent cheating. By assigning decreasing weights radiating outward from the central view, pixels with a concentration of colors close to the central viewpoint are considered more significant. The problem of underestimating the depth of background caused by transparent cheating is effectively solved and corrected. Experiments on synthetic and real-world data show that our method can produce high-quality depth estimation under the transparency and the reflectivity of 90% to 20%. The proposed triple-consistency-based algorithm outperforms state-of-the-art LF depth estimation methods in terms of accuracy and robustness.
    4. D. Yang, H. Sheng, S. Wang, S. Wang, Z. Xiong and W. Ke, "Boosting Light Field Spatial Super-Resolution via Masked Light Field Modeling," in IEEE Transactions on Computational Imaging, vol. 10, pp. 1317-1330, 2024, doi: 10.1109/TCI.2024.3451998.
      Abstract. Light field (LF) imaging benefits a wide range of applications with geometry information it captured. However, due to the restricted sensor resolution, LF cameras sacrifice spatial resolution for sufficient angular resolution. Hence LF spatial super-resolution (LFSSR), which highly relies on inter-intra view correlation extraction, is widely studied. In this paper, a self-supervised pre-training scheme, named masked LF modeling (MLFM), is proposed to boost the learning of inter-intra view correlation for better super-resolution performance. To achieve this, we first introduce a transformer structure, termed as LFormer, to establish direct inter-view correlations inside the 4D LF. Compared with traditional disentangling operations for LF feature extraction, LFormer avoids unnecessary loss in angular domain. Therefore it performs better in learning the cross-view mapping among pixels with MLFM pre-training. Then by cascading LFormers as encoder, LFSSR network LFormer-Net is designed, which comprehensively performs inter-intra view high-frequency information extraction. In the end, LFormer-Net is pre-trained with MLFM by introducing a Spatially-Random Angularly-Consistent Masking (SRACM) module. With a high masking ratio, MLFM pre-training effectively promotes the performance of LFormer-Net. Extensive experiments on public datasets demonstrate the effectiveness of MLFM pre-training and LFormer-Net. Our approach outperforms state-of-the-art LFSSR methods numerically and visually on both small- and large-disparity datasets.
    5. Y. Luo, W. Ke, C.T. Lam and S.K. Im, "An accurate slicing method for dynamic time warping algorithm and the segment-level early abandoning optimization," in Knowledge-Based Systems, vol. 300, 2024, doi: 10.1016/j.knosys.2024.112231.
      Abstract. Time series data analysis algorithms have been gaining significant importance in the research community. Extensive studies have confirmed that Dynamic Time Warping (DTW) is the best distance measure in time series analysis across multiple domains. However, DTW is a time-consuming algorithm with quadratic time complexity, which limits its widespread adoption. In this paper, we propose a novel slicing mechanism for DTW, called Partial Dynamic Time Warping (PDTW). PDTW is capable of dividing a complete DTW calculation into multiple independent partial calculations. The proposed PDTW ensures very consistent alignments with the original Constrained DTW (CDTW) by incorporating additional data windows for each pairwise time series. On the basis of PDTW, we also present the Early Abandoning Dynamic Time Warping (EADTW) technique, which saves a substantial amount of computing time by abandoning superfluous calculations of unnecessary segments in the time series classification task. Large-scale experimental results on 96 datasets from the UCR archive show the effectiveness of PDTW and EADTW. The cumulative sum of PDTW distances for segments is nearly identical to the CDTW distance, and in many cases, it is exactly equal. This remarkable characteristic of PDTW indicates its immense potential for further development and optimization of DTW, such as parallelization. In classification tasks, EADTW is 1.99 times faster than CDTW on average, with a maximum speedup of 2.89 times, while maintaining accuracy. Additionally, it enhances accuracy by 0.93% and improves speed by 2.34 times under the accuracy priority parameter.
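      As a rough illustration of the slicing and early-abandoning ideas above (not the authors' PDTW/EADTW implementation, which additionally pads each segment with extra data windows to keep alignments consistent with CDTW), the sketch below sums banded DTW over consecutive segments and abandons a comparison once the running total exceeds the best distance seen so far; the function names and the window/n_segments parameters are illustrative assumptions.

      ```python
      import numpy as np

      def cdtw(a, b, window):
          # Constrained (Sakoe-Chiba band) DTW distance between 1-D series a and b.
          n, m = len(a), len(b)
          D = np.full((n + 1, m + 1), np.inf)
          D[0, 0] = 0.0
          for i in range(1, n + 1):
              for j in range(max(1, i - window), min(m, i + window) + 1):
                  cost = (a[i - 1] - b[j - 1]) ** 2
                  D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
          return D[n, m]

      def sliced_dtw(a, b, n_segments, window, best_so_far=np.inf):
          # Accumulate banded DTW over consecutive slices; stop early once the
          # running sum already exceeds the best distance seen so far.
          bounds = np.linspace(0, len(a), n_segments + 1, dtype=int)
          total = 0.0
          for s, e in zip(bounds[:-1], bounds[1:]):
              total += cdtw(a[s:e], b[s:e], window)
              if total > best_so_far:      # segment-level early abandoning
                  return np.inf
          return total

      rng = np.random.default_rng(0)
      x, y = rng.standard_normal(128), rng.standard_normal(128)
      print(cdtw(x, y, window=10), sliced_dtw(x, y, n_segments=4, window=10))
      ```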
    6. Z. Cui, H. Sheng, D. Yang, S. Wang, R. Chen and W. Ke, "Light Field Depth Estimation for Non-Lambertian Objects via Adaptive Cross Operator," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 2, pp. 1199-1211, Feb. 2024, doi: 10.1109/TCSVT.2023.3292884.
      Abstract. Light field (LF) depth estimation is a crucial basis for LF-related applications. Most existing methods are based on the Lambertian assumption and cannot deal with non-Lambertian surfaces represented by transparent objects and mirrors. In this paper, we propose a novel Adaptive-Cross-Operator-based (ACO) depth estimation algorithm for non-Lambertian LF. By analyzing the imaging characteristics of non-Lambertian regions, it is found that the difficulty of depth estimation lies in the photo inconsistency of the center view. Combining with the two-branch structure, we propose ACO with an inter-branch cooperation strategy to adaptively separate depth information with different reflectance coefficients. We discover that the bimodal distribution feature of the operator filtering results can assist in the separation of multi-layer scene information. The first detection branch filters the EPI and implicitly records the severity of multi-layer scene aliasing. According to the identification of bimodal distribution features, the non-Lambertian regions are marked out and the depth of the foreground is estimated. The second branch receives guidance from the first to dynamically adjust the inner weight and infer the background’s depth after weakening the interference from the foreground. Finally, the depth information separation of multi-layer scenes is achieved by extracting the unique X-shaped linear structure. Without the reflection coefficients of the non-Lambertian object, the proposed method can produce high-quality depth estimation under the transparency of 90% to 20%. Experimental results show that the proposed ACO outperforms state-of-the-art LF depth estimation methods in terms of accuracy and robustness.
    7. Z. Wu, C. Wang, W. Zhang, G. Sun, W. Ke and Z. Xiong, "Online 3D behavioral tracking of aquatic model organism with a dual-camera system," in Advanced Engineering Informatics, vol. 61, 2024, doi: 10.1016/j.aei.2024.102481.
      Abstract. A behavioral tracking system for aquatic model organisms is crucial for applications in aquaculture, environment and biomedicine, as it allows humans to monitor subject states by automatically recognizing individual identities and quantifying their movement trajectories. Previous research has been devoted to this topic, but existing systems are still not simple and effective enough. Therefore, this work introduces a novel online monitoring system implemented by dual-camera equipment and software modules consisting of an object detector and a multi-view multi-target tracker. The tracker provides the abilities of cross-view matching, underwater 3D reconstruction, and 3D target tracking. Specifically, our solution adopts a new paradigm, called tracking by early-reconstruction, which prioritizes the 3D reconstruction of targets’ coordinates on a frame-by-frame basis and then tracks them directly in 3D space rather than in a 2D image plane. This paradigm simplifies the complex multi-view tracking problem into a series of local association procedures, allowing us to achieve an online resolution through the iterative approach. To verify the effectiveness of the system, we employ zebrafish as the research subject, and evaluate the accuracy and robustness of the system on a tracking benchmark, behavioral tasks and simulated data. Finally, we conduct extensive experiments that demonstrate the efficiency and effectiveness of the proposed system.
    8. Q. Gu and W. Ke, "Typing Requirement Model as Coroutines," in IEEE Access, vol. 12, pp. 8449-8460, 2024, doi: 10.1109/ACCESS.2024.3352115.
      Abstract. Model-Driven Engineering (MDE) is a technique that aims to boost productivity in software development and ensure the safety of critical systems. Central to MDE is the refinement of high-level requirement models into executable code. Given that requirement models form the foundation of the entire development process, ensuring their correctness is crucial. RM2PT is a widely used MDE platform that employs the REModel language for requirement modeling. REModel contains contract sections and other sections, including a UML sequence diagram. This paper contributes a coroutine-based type system that represents pre- and post-conditions in the contract sections of a requirement model as the receiving and yielding parts of coroutines, respectively. The type system is capable of composing coroutine types, so that users can view functions as a whole system and check their collective behavior. By doing so, our type system ensures that the contracts defined in it are executed as outlined in the accompanying sequence diagram. We assessed our approach using four case studies provided by RM2PT, validating the accuracy of the models.
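      As a loose analogy to the coroutine view of contracts described above (this is not the paper's REModel/RM2PT notation), the sketch below models one system operation as a Python generator whose received value plays the role of the precondition side and whose yielded value plays the role of the postcondition side; the withdrawal operation and its conditions are invented for illustration.

      ```python
      def withdraw_contract():
          # One system operation as a coroutine: receive inputs, yield the result.
          balance, amount = yield                 # receiving part ~ precondition side
          assert 0 < amount <= balance            # precondition check
          new_balance = balance - amount
          assert new_balance == balance - amount  # postcondition check
          yield new_balance                       # yielding part ~ postcondition side

      # Drive the coroutine in the order a sequence diagram would prescribe.
      op = withdraw_contract()
      next(op)                   # prime the coroutine
      print(op.send((100, 30)))  # prints 70
      ```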
    9. L.M. Hoi, W. Ke and S.K. Im, "Manipulating Data Lakes Intelligently With Java Annotations," in IEEE Access, vol. 12, pp. 34903-34917, 2024, doi: 10.1109/ACCESS.2024.3372618.
      Abstract. Data lakes are typically large data repositories where enterprises store data in a variety of data formats. From the perspective of data storage, data can be categorized into structured, semi-structured, and unstructured data. On the one hand, due to the complexity of data forms and transformation procedures, many enterprises simply pour valuable data into data lakes without organizing and managing them effectively. This can create data silos (or data islands) or even data swamps, with the result that some data will be permanently invisible. Although data are integrated into a data lake, they are simply physically stored in the same environment and cannot be correlated with other data to leverage their precious value. On the other hand, processing data from a data lake into a desired format is always a difficult and tedious task that requires experienced programming skills, such as conversion from structured to semi-structured. In this article, a novel software framework called Java Annotation for Manipulating Data Lakes (JAMDL) that can manage heterogeneous data is proposed. This approach uses Java annotations to express the properties of data in metadata (data about data) so that the data can be converted into different formats and managed efficiently in a data lake. Furthermore, this article suggests using artificial intelligence (AI) translation models to generate Data Manipulation Language (DML) operations for data manipulation and uses AI recommendation models to improve the visibility of data when data precipitation occurs.
    10. S. Wang, D. Yang, H. Sheng, J. Shen, Y. Zhang and W. Ke, "A Blockchain-Enabled Distributed System for Trustworthy and Collaborative Intelligent Vehicle Re-Identification," in IEEE Transactions on Intelligent Vehicles, vol. 9, no. 2, pp. 3271-3282, Feb. 2024, doi: 10.1109/TIV.2023.3347267.
      Abstract. Vehicle re-identification (ReID) is a hot topic in intelligent city surveillance. With the development of smart cameras and vehicular edge computing (VEC), massive media data have opened up new possibilities for enhancing the applications of vehicle ReID. However, traditional vehicle re-identification systems face the following challenges: 1) it is difficult to recognize the identities of vehicles across various views and with similar appearances, and 2) current systems are hard to extend to large numbers of cameras in a low-trust VEC environment. To solve these problems, we propose a Blockchain-based Collaborative Vehicle ReID (BCV-ReID) system in this paper. It contains two core parts: Viewpoint-identity Query Net (VQNet) and VehicleChain (VChain). By utilizing the viewpoint information and local details simultaneously, VQNet can distinguish the vehicle identities in various cross-camera scenes. It employs viewpoint queries and spatial self-attention to learn the inherent correlation of the vehicle parts, enhancing the ability to distinguish vehicles among various viewpoints. Then, we integrate VChain with VQNet to realize a collaborative vehicle ReID system. The ReID task is illustrated from the perspective of blockchain transactions. All transactions are validated by a deeply integrated ReID consensus to counter potential malicious attacks. Experiments show that the proposed method achieves comparable results on three well-known ReID datasets, as well as outstanding performance in real applications.
    11. C. Wang, Z. Wu, Y. Chen, W. Zhang, W. Ke and Z. Xiong, "Improving 3-D Zebrafish Tracking With Multiview Data Fusion and Global Association," in IEEE Sensors Journal, vol. 23, no. 15, pp. 17245-17259, 1 Aug. 2023, doi: 10.1109/JSEN.2023.3288729.
      Abstract. Zebrafish behavioral patterns reveal valuable insights for biomedical research. To accurately identify these patterns, visual tracking systems need to reconstruct 3-D trajectories from multiview video sequences. However, 3-D zebrafish tracking faces challenges such as the dynamics in movements, the similarity in appearances, and the distortion caused by different viewpoints. In this article, we propose a new method for robust 3-D zebrafish trajectory reconstruction based on multiview data fusion and global association. Our method generates reliable segments of 2-D/3-D trajectories, called tracklets, where we consider short-term cues of appearance similarity and motion consistency and propose corresponding scoring metrics. Moreover, we use a lazy-reconstruction strategy to enhance the overall accuracy of 3-D trajectories by taking into account the global context. Extensive experiments on the public 3D-ZeF20 dataset demonstrate the effectiveness of the proposed method, achieving 67.9% multiple object tracking accuracy (MOTA), 64.3% ID F1 Score (IDF1), and 55.0 MTBFm.
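      Systems of this kind build on standard multiview geometry; the sketch below shows a textbook linear (DLT) triangulation of a single 3-D point from two views, the sort of reconstruction step such a pipeline relies on, not the paper's tracking method itself, and the toy camera matrices are invented for illustration.

      ```python
      import numpy as np

      def triangulate(P1, P2, x1, x2):
          # Linear (DLT) triangulation: P1, P2 are 3x4 projection matrices,
          # x1, x2 are (u, v) coordinates of the same point seen in two views.
          A = np.stack([
              x1[0] * P1[2] - P1[0],
              x1[1] * P1[2] - P1[1],
              x2[0] * P2[2] - P2[0],
              x2[1] * P2[2] - P2[1],
          ])
          _, _, Vt = np.linalg.svd(A)
          X = Vt[-1]
          return X[:3] / X[3]          # homogeneous -> Euclidean

      # Two toy cameras: identity pose, and a one-unit translation along x.
      P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
      P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
      X_true = np.array([0.2, -0.1, 4.0])
      x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
      x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]
      print(np.round(triangulate(P1, P2, x1, x2), 6))  # ~ [ 0.2 -0.1  4. ]
      ```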
    12. Y. Wu, H. Sheng, Y. Zhang, S. Wang, Z. Xiong and W. Ke, "Hybrid Motion Model for Multiple Object Tracking in Mobile Devices," in IEEE Internet of Things Journal, vol. 10, no. 6, pp. 4735-4748, 15 March 2023, doi: 10.1109/JIOT.2022.3219627.
      Abstract. For an intelligent transportation system, multiple object tracking (MOT) becomes more challenging when moving from traditional static surveillance cameras to mobile devices of the Internet of Things (IoT). To cope with this problem, previous works always rely on additional information from multivision, various sensors, or precalibration. Based only on a monocular camera, we propose a hybrid motion model to improve the tracking accuracy in mobile devices. First, the model evaluates camera motion hypotheses by measuring optical flow similarity and transition smoothness to perform robust camera trajectory estimation. Second, along the camera trajectory, smooth dynamic projection is used to map objects from image to world coordinates. Third, to deal with trajectory motion inconsistency, which is caused by occlusion and interactions over long time intervals, tracklet motion is described by the multimode motion filter for adaptive modeling. Fourth, in tracklet association, we propose a spatiotemporal evaluation mechanism, which achieves higher discriminability in motion measurement. Experiments on the MOT15, MOT17, and KITTI benchmarks show that our proposed method improves trajectory accuracy, especially on mobile devices, and achieves competitive results compared with other state-of-the-art methods.
    13. X. Jiang, Y. Xu, W. Ke, Y. Zhang, Q.-X. Zhu and Y.L. He, "An Imbalanced Multifault Diagnosis Method Based on Bias Weights AdaBoost," in IEEE Transactions on Instrumentation and Measurement, vol. 71, pp. 1-8, 2022, Art no. 3505908, doi: 10.1109/TIM.2022.3149097.
      Abstract. Fault diagnosis plays an important role in ensuring process safety. It is noted that imbalance between fault data and normal data always exists, and multifault obviously outranges a single fault in common, which leads to more challenges to fault diagnosis. In this article, an imbalanced multifault diagnosis method based on bias weights AdaBoost (BW-AdaBoost) is proposed. First, majority normal samples are under-sampled by K-nearest neighbor (KNN) to collect the boundary samples between majority normal samples and minority fault samples, and then the bias datasets are formed by under-sampled majority samples and minority samples. Second, different weak classifiers with adaptive weights are constructed based on the bias datasets and are integrated into a strong classifier, which is taken as the base classifier. While constructing the weak classifier, higher weights are given to the items corresponding to the minority class in the loss function to enhance the influence of minority class samples. Third, to solve the multifault problem, the base classifiers are integrated into a multiclassification model by the hierarchical structure which needs fewer classifiers and less computational expense. Finally, through simulation experiment, the comparison results show that the proposed imbalanced multifault diagnosis method based on BW-AdaBoost can effectively improve the diagnosis accuracy and F1 score.
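      To make the reweighting idea above concrete, the following sketch is a plain discrete AdaBoost over decision stumps in which mistakes on the minority (fault) class receive an extra weight factor; it is an illustrative approximation, not the paper's BW-AdaBoost (which also uses KNN under-sampling and a hierarchical multiclass structure), and the minority_boost parameter and the use of stumps are assumptions.

      ```python
      import numpy as np
      from sklearn.tree import DecisionTreeClassifier

      def biased_adaboost(X, y, n_rounds=20, minority_boost=2.0):
          # Discrete AdaBoost with stumps; labels in {-1, +1}, +1 = minority/fault class.
          n = len(y)
          w = np.ones(n) / n
          stumps, alphas = [], []
          for _ in range(n_rounds):
              stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
              miss = (stump.predict(X) != y).astype(float)
              err = np.clip(w @ miss, 1e-10, 1 - 1e-10)
              alpha = 0.5 * np.log((1 - err) / err)
              # Bias: misclassified minority samples get an extra weight factor.
              bias = np.where((y == 1) & (miss == 1), minority_boost, 1.0)
              w = w * bias * np.exp(alpha * miss)
              w /= w.sum()
              stumps.append(stump)
              alphas.append(alpha)
          return stumps, alphas

      def predict(stumps, alphas, X):
          # Sign of the alpha-weighted vote of all stumps.
          return np.sign(sum(a * s.predict(X) for s, a in zip(stumps, alphas)))
      ```

      A usage run would fit X of shape (n_samples, n_features) and y in {-1, +1}, with the fault class mapped to +1.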
    14. K.H. Chan, W. Ke and S.K. Im, "A General Method for Generating Discrete Orthogonal Matrices," in IEEE Access, vol. 9, pp. 120380-120391, 2021, doi: 10.1109/ACCESS.2021.3107579.
      Abstract. Discrete orthogonal matrices have applications in information coding and cryptography. It is often challenging to generate discrete orthogonal matrices. A common approach widely in use is to discretize continuous orthogonal functions that have been discovered. The need for such continuous functions is restrictive. Polynomials, as the simplest class of continuous functions, are widely studied for their orthogonality, to serve the purpose of generating orthogonal matrices. However, beginning with continuous orthogonal polynomials still takes much work. To overcome this complexity while improving the efficiency and flexibility, we present a general method for generating orthogonal matrices directly through the construction of certain even and odd polynomials from a set of distinct positive values, bypassing the need for continuous orthogonal functions. We present a constructive proof by induction that not only asserts the existence of such polynomials, but also tells how to iteratively construct them. Besides the derivation of the method, which is as simple as a few nested loops, we discuss two well-known discrete transforms, the Discrete Cosine Transform and the Discrete Tchebichef Transform, showing how they can be obtained using our method with specific values, and how to embed them into the transform module of video coding. By the same token, we also give examples of generating new orthogonal matrices from arbitrarily chosen values. The demonstrative experiments indicate that our method is not only simpler to implement, but also more efficient and flexible. It can generate orthogonal matrices of larger sizes, compared with existing methods.
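      For context, the sketch below shows the classical route the abstract mentions: sampling a known continuous orthogonal basis (here the DCT-II) to obtain a discrete orthogonal matrix and checking orthogonality numerically. It illustrates the kind of matrix being generated, not the paper's polynomial-based construction, which avoids starting from continuous functions.

      ```python
      import numpy as np

      def dct_matrix(n):
          # Orthonormal DCT-II matrix: rows are sampled cosine basis functions.
          k = np.arange(n).reshape(-1, 1)   # row index (frequency)
          i = np.arange(n).reshape(1, -1)   # column index (sample position)
          C = np.cos(np.pi * k * (2 * i + 1) / (2 * n))
          C[0, :] *= np.sqrt(1 / n)
          C[1:, :] *= np.sqrt(2 / n)
          return C

      C = dct_matrix(8)
      print(np.allclose(C @ C.T, np.eye(8)))  # True: the rows are orthonormal
      ```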
    15. W. Ke and K.H. Chan, "Pattern Matching Based on Object Graphs," in IEEE Access, vol. 9, pp. 159313-159325, 2021, doi: 10.1109/ACCESS.2021.3128575.
      Abstract. Pattern matching has been widely adopted in functional programming languages, and is gradually gaining popularity in OO languages, from Scala to Python. The structural pattern matching currently in use has its foundation in algebraic data types from functional languages. To better reflect the pointer structures of OO programs, we propose a pattern matching extension to general statically typed OO languages based on object graphs. With this extension, we support patterns with aliasing and circular referencing, which are typically found in pointer structures. With the requirement of only an abstract subtyping preorder on types, our extension is not restricted to a particular hierarchical class model. We give the formal base of the graph model, which is able to handle aliases and cycles in patterns, together with the abstract syntax to construct the object graphs. More complex cases of conjunction and disjunction of multiple patterns are explored with resolution. We present the type checking rules and operational semantics to reason about soundness by proving type safety. We also discuss the design decisions, applicability and limitations of our pattern matching extension.
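      For readers unfamiliar with the baseline being extended, the sketch below shows the algebraic-data-type style of structural pattern matching available in Python 3.10+, which handles tree-shaped values but cannot directly express the aliasing and circular references that the object-graph extension targets; the Node class is an invented example.

      ```python
      from dataclasses import dataclass
      from typing import Optional

      @dataclass
      class Node:
          value: int
          next: Optional["Node"] = None

      def describe(xs: Optional[Node]) -> str:
          # Tree-shaped (ADT-style) patterns: what structural matching offers today.
          match xs:
              case None:
                  return "empty"
              case Node(value=v, next=None):
                  return f"single {v}"
              case Node(value=v, next=Node(value=w)):
                  return f"starts with {v}, {w}"
          return "unreachable"

      print(describe(Node(1, Node(2))))  # starts with 1, 2

      # Aliasing and cycles fall outside such patterns:
      b = Node(3)
      b.next = b  # a cyclic object graph; no tree-shaped pattern says "points back to itself"
      ```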
    16. H. Sheng et al., "Combining Pose Invariant and Discriminative Features for Vehicle Reidentification," in IEEE Internet of Things Journal, vol. 8, no. 5, pp. 3189-3200, 1 March 2021, doi: 10.1109/JIOT.2020.3015239.
      Abstract. Vehicle reidentification, aiming at identifying vehicles across images, has drawn a lot of attention and has made significant achievements in recent years. However, vehicle reidentification remains a challenging task caused by severe appearance changes due to different orientations. In practice, the result of reidentification is greatly influenced by the pose of vehicles, and we call this influence the pose barrier problem. One way to address the pose barrier problem is to train a feature representation that is invariant for various vehicle poses. To this end, we present pose robust features (PRFs) that contain two components: 1) pose-invariant features (PIFs) and 2) pose discriminative features (PDFs). On the one hand, PIF excels at capturing the overall characteristics of vehicles. When training PIF, we adopt an identity classifier as well as an orientation classifier. In addition, an adversarial loss is deployed in the PIF network. On the other hand, we design a PDF network, which has a similar architecture to the PIF network but can distinguish the difference between local details. The difference between PDF and PIF is that the PDF network is trained without the adversarial loss. Finally, by combining PIF and PDF, PRF has the advantages of the two features and can alleviate the influence of the pose barrier problem. Experiments are conducted on the VeRi-776 and VehicleID data sets. We show that PIF and PDF are complementary and that PRF produces competitive performance compared with state-of-the-art approaches.
    17. Y. Zhang, H. Sheng, Y. Wu, S. Wang, W. Ke and Z. Xiong, "Multiplex Labeling Graph for Near-Online Tracking in Crowded Scenes," in IEEE Internet of Things Journal, vol. 7, no. 9, pp. 7892-7902, Sept. 2020, doi: 10.1109/JIOT.2020.2996609.
      Abstract. In recent years, the demand for intelligent devices related to the Internet of Things (IoT) has been rapidly increasing. In the field of computer vision, many algorithms have been preinstalled in IoT devices to achieve higher efficiency, such as face recognition, area detection, target tracking, etc. Tracking is an important but complex task that needs high-efficiency solutions in real applications. There is a common assumption that each detection can represent only one pedestrian, reflecting the fact that pedestrians do not overlap in physical space. In fact, the pixels of the image do not exactly correspond to the positions in the real world. In order to overcome the limitation of this assumption, we remove this unreasonable assumption and present a novel idea that each detector response can have multiple labels to describe different targets at the same time. Therefore, we propose a graph-based method for near-online tracking in this article. We introduce a detection multiplexing method for tracking in the monocular image and propose a multiplex labeling graph (MLG) model. Each node in MLG has the ability to represent multiple targets. In addition, we address the shortcoming of graph-based trackers in using temporal features. We construct long short-term memory networks to model motion and appearance features for MLG optimization. On the public multiobject tracking challenge benchmark, our near-online method gains satisfactory efficiency and achieves state-of-the-art results without additional private detections.
    18. Y. Zhang, H. Sheng, Y. Wu, S. Wang, W. Lyu, W. Ke and Z. Xiong, "Long-Term Tracking With Deep Tracklet Association," in IEEE Transactions on Image Processing, vol. 29, pp. 6694-6706, 2020, doi: 10.1109/TIP.2020.2993073.
      Abstract. Recently, most multiple object tracking (MOT) algorithms adopt the idea of tracking-by-detection. Relevant research shows that the performance of the detector obviously affects the tracker, while the improvement of detectors has gradually slowed down in recent years. Therefore, trackers using tracklets (short trajectories) are proposed to generate more complete trajectories. Although there are various tracklet generation algorithms, the fragmentation problem still often occurs in crowded scenes. In this paper, we introduce an iterative clustering method that generates more tracklets while maintaining high confidence. Our method shows robust performance in avoiding internal identity switches. Then we propose a deep association method for tracklet association. In terms of motion and appearance, we construct a motion evaluation network (MEN) and an appearance evaluation network (AEN) to learn long-term features of tracklets for association. In order to explore more robust features of tracklets, a tracklet-based training mechanism is also introduced. Tracklet groups are used as the input of the networks instead of discrete detections. Experimental results show that our training method enhances the performance of the networks. In addition, our tracking framework generates more complete trajectories while maintaining the unique identity of each target at the same time. On the latest MOT 2017 benchmark, we achieve state-of-the-art results.
    19. Y. Yang, X. Li, W. Ke and Z. Liu, "Automated Prototype Generation From Formal Requirements Model," in IEEE Transactions on Reliability, vol. 69, no. 2, pp. 632-656, June 2020, doi: 10.1109/TR.2019.2934348.
      Abstract. Prototyping is an effective and efficient way of requirements validation to avoid introducing errors in the early stage of software development. However, manually developing a prototype of a software system requires additional effort, which would increase the overall cost of software development. In this article, we present an approach, with the developed tool RM2PT, for automated prototype generation from formal requirements models for requirements validation. A requirements model consists of a use case diagram, a conceptual class diagram, use case definitions specified by system sequence diagrams, and the contracts of their system operations. A system operation contract is formally specified by a pair of pre- and postconditions in the Object Constraint Language. We propose a method with a set of transformation rules to decompose a contract into executable parts and nonexecutable parts. An executable part can be automatically transformed into a sequence of primitive operations by applying the corresponding rules, and a nonexecutable part is not transformable with the rules. The tool RM2PT provides a mechanism for developers to manually develop a piece of program for each nonexecutable part, which can be plugged into the generated prototype source code automatically. We have conducted four case studies with over 50 use cases. The experimental results show that 93.65% of the system operations are executable, and only 6.35% are nonexecutable, which can be implemented by developers manually or by invoking third-party application programming interfaces (APIs). Overall, the result is satisfactory. Each prototype of the four case studies, generated in about one second, would require approximately one day's manual implementation by a skilled programmer. The proposed approach, with the developed computer-aided software engineering tool, can be applied in the software industry for requirements engineering.