1 Characterization of Magnetic Labyrinthine Structures through Junctions and Terminals Detection using Template Matching and CNN In material sciences, characterizing faults in periodic structures is vital for understanding material properties. To characterize magnetic labyrinthine patterns, it is necessary to accurately identify junctions and terminals, often featuring over a thousand closely packed defects per image. This study introduces a new technique called TM-CNN (Template Matching - Convolutional Neural Network) designed to detect a multitude of small objects in images, such as defects in magnetic labyrinthine patterns. TM-CNN was used to identify these structures in 444 experimental images, and the results were explored to deepen the understanding of magnetic materials. It employs a two-stage detection approach combining template matching, used in initial detection, with a convolutional neural network, used to eliminate incorrect identifications. To train a CNN classifier, it is necessary to create a large number of training images. This difficulty prevents the use of CNN in many practical applications. TM-CNN significantly reduces the manual workload for creating training images by automatically making most of the annotations and leaving only a small number of corrections to human reviewers. In testing, TM-CNN achieved an impressive F1 score of 0.988, far outperforming traditional template matching and CNN-based object detection algorithms. 4 authors · Jan 29, 2024
- Volumetric Wireframe Parsing from Neural Attraction Fields The primal sketch is a fundamental representation in Marr's vision theory, which allows for parsimonious image-level processing from 2D to 2.5D perception. This paper takes a further step by computing 3D primal sketch of wireframes from a set of images with known camera poses, in which we take the 2D wireframes in multi-view images as the basis to compute 3D wireframes in a volumetric rendering formulation. In our method, we first propose a NEural Attraction (NEAT) Fields that parameterizes the 3D line segments with coordinate Multi-Layer Perceptrons (MLPs), enabling us to learn the 3D line segments from 2D observation without incurring any explicit feature correspondences across views. We then present a novel Global Junction Perceiving (GJP) module to perceive meaningful 3D junctions from the NEAT Fields of 3D line segments by optimizing a randomly initialized high-dimensional latent array and a lightweight decoding MLP. Benefitting from our explicit modeling of 3D junctions, we finally compute the primal sketch of 3D wireframes by attracting the queried 3D line segments to the 3D junctions, significantly simplifying the computation paradigm of 3D wireframe parsing. In experiments, we evaluate our approach on the DTU and BlendedMVS datasets with promising performance obtained. As far as we know, our method is the first approach to achieve high-fidelity 3D wireframe parsing without requiring explicit matching. 6 authors · Jul 14, 2023
1 CONFLATOR: Incorporating Switching Point based Rotatory Positional Encodings for Code-Mixed Language Modeling The mixing of two or more languages is called Code-Mixing (CM). CM is a social norm in multilingual societies. Neural Language Models (NLMs) like transformers have been effective on many NLP tasks. However, NLM for CM is an under-explored area. Though transformers are capable and powerful, they cannot always encode positional information since they are non-recurrent. Therefore, to enrich word information and incorporate positional information, positional encoding is defined. We hypothesize that Switching Points (SPs), i.e., junctions in the text where the language switches (L1 -> L2 or L2 -> L1), pose a challenge for CM Language Models (LMs), and hence give special emphasis to SPs in the modeling process. We experiment with several positional encoding mechanisms and show that rotatory positional encodings along with switching point information yield the best results. We introduce CONFLATOR: a neural language modeling approach for code-mixed languages. CONFLATOR tries to learn to emphasize switching points using smarter positional encoding, both at unigram and bigram levels. CONFLATOR outperforms the state-of-the-art on two tasks based on code-mixed Hindi and English (Hinglish): (i) sentiment analysis and (ii) machine translation. 8 authors · Sep 11, 2023
18 Boundary Attention: Learning to Find Faint Boundaries at Any Resolution We present a differentiable model that explicitly models boundaries -- including contours, corners and junctions -- using a new mechanism that we call boundary attention. We show that our model provides accurate results even when the boundary signal is very weak or is swamped by noise. Compared to previous classical methods for finding faint boundaries, our model has the advantages of being differentiable; being scalable to larger images; and automatically adapting to an appropriate level of geometric detail in each part of an image. Compared to previous deep methods for finding boundaries via end-to-end training, it has the advantages of providing sub-pixel precision, being more resilient to noise, and being able to process any image at its native resolution and aspect ratio. 6 authors · Jan 1, 2024
2 Deep Neuromorphic Networks with Superconducting Single Flux Quanta Conventional semiconductor-based integrated circuits are gradually approaching fundamental scaling limits. Many prospective solutions have recently emerged to supplement or replace both the technology on which basic devices are built and the architecture of data processing. Neuromorphic circuits are a promising approach to computing where techniques used by the brain to achieve high efficiency are exploited. Many existing neuromorphic circuits rely on unconventional and useful properties of novel technologies to better mimic the operation of the brain. One such technology is single flux quantum (SFQ) logic -- a cryogenic superconductive technology in which the data are represented by quanta of magnetic flux (fluxons) produced and processed by Josephson junctions embedded within inductive loops. The movement of a fluxon within a circuit produces a quantized voltage pulse (SFQ pulse), resembling a neuronal spiking event. These circuits routinely operate at clock frequencies of tens to hundreds of gigahertz, making SFQ a natural technology for processing high frequency pulse trains. Prior proposals for SFQ neural networks often require energy-expensive fluxon conversions, involve heterogeneous technologies, or exclusively focus on device level behavior. In this paper, a design methodology for deep single flux quantum neuromorphic networks is presented. Synaptic and neuronal circuits based on SFQ technology are presented and characterized. Based on these primitives, a deep neuromorphic XOR network is evaluated as a case study, both at the architectural and circuit levels, achieving wide classification margins. The proposed methodology does not employ unconventional superconductive devices or semiconductor transistors. The resulting networks are tunable by an external current, making this proposed system an effective approach for scalable cryogenic neuromorphic computing. 4 authors · Sep 21, 2023
- Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling Recently, there has been growing interest in developing learning-based methods to detect and utilize salient semi-global or global structures, such as junctions, lines, planes, cuboids, smooth surfaces, and all types of symmetries, for 3D scene modeling and understanding. However, the ground truth annotations are often obtained via human labor, which is particularly challenging and inefficient for such tasks due to the large number of 3D structure instances (e.g., line segments) and other factors such as viewpoints and occlusions. In this paper, we present a new synthetic dataset, Structured3D, with the aim of providing large-scale photo-realistic images with rich 3D structure annotations for a wide spectrum of structured 3D modeling tasks. We take advantage of the availability of professional interior designs and automatically extract 3D structures from them. We generate high-quality images with an industry-leading rendering engine. We use our synthetic dataset in combination with real images to train deep networks for room layout estimation and demonstrate improved performance on benchmark datasets. 6 authors · Aug 1, 2019
- Junction Tree Variational Autoencoder for Molecular Graph Generation We seek to automate the design of molecules based on specific chemical properties. In computational terms, this task involves continuous embedding and generation of molecular graphs. Our primary contribution is the direct realization of molecular graphs, a task previously approached by generating linear SMILES strings instead of graphs. Our junction tree variational autoencoder generates molecular graphs in two phases, by first generating a tree-structured scaffold over chemical substructures, and then combining them into a molecule with a graph message passing network. This approach allows us to incrementally expand molecules while maintaining chemical validity at every step. We evaluate our model on multiple tasks ranging from molecular generation to optimization. Across these tasks, our model outperforms previous state-of-the-art baselines by a significant margin. 3 authors · Feb 12, 2018