Which Training Methods for GANs do actually Converge?
We include a memory-efficient, pure Python implementation of the locally masked convolution, as well as training and evaluation code.
PS: I cannot annotate all the papers I read, but if I liked one, it will be uploaded here.
Deep Code Search, ICSE '18, May 27-June 3, 2018, Gothenburg, Sweden.
[Figure 2: Illustration of max pooling over the hidden states; the rows (3, 4, 7, 5), (1, 5, 2, 0), and (8, 3, 2, 4) are pooled to 7, 5, and 8.]
Here [a;b] ∈ R^{2d} represents the concatenation of two vectors, W ∈ R^{2d×d} is the matrix of trainable parameters in the RNN, and tanh is the nonlinear activation function of the RNN.
Contribute to FroyoZzz/CV-Papers-Codes development by creating an account on GitHub.
Locally Masked Convolution for Autoregressive Models.
If you're writing a research paper in computer science or another technical discipline, you may want to include source code in your research sources, such as code you find in a GitHub repository.
Use this thread to request that your favorite conference be added to our watchlist and to the PWC list.
Microsoft: New VS Code update is out – plus here's what GitHub Codespaces will cost.
having a link in the paper to an online copy (on their website or a public repository like GitHub).
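The Figure 2 caption above mentions concatenation [a;b], a trainable matrix W ∈ R^{2d×d}, a tanh activation, and max pooling over hidden states. The following is a minimal pure-Python sketch of those pieces, not the paper authors' implementation. The update rule h_t = tanh([h_{t-1}; w_t]^T W) and the function names `rnn_step` and `max_pool` are assumptions made for illustration; only the shapes and the pooled example values come from the text.

```python
import math


def rnn_step(h_prev, w_t, W):
    """One recurrent step (assumed form): h_t = tanh([h_prev; w_t]^T W).

    h_prev and w_t are lists of length d; W is a 2d x d list of lists.
    """
    d = len(h_prev)
    concat = h_prev + w_t  # [a; b]: concatenation, length 2d
    return [
        math.tanh(sum(concat[i] * W[i][j] for i in range(2 * d)))
        for j in range(d)
    ]


def max_pool(hidden_states):
    """Element-wise maximum over a sequence of hidden-state vectors."""
    return [max(values) for values in zip(*hidden_states)]


# Hidden states read off the figure residue (one vector per time step).
states = [[3, 1, 8], [4, 5, 3], [7, 2, 2], [5, 0, 4]]
pooled = max_pool(states)  # [7, 5, 8]
```

Pooling takes the maximum of each dimension across all time steps, so a sequence of any length is reduced to a single d-dimensional vector.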
Authors usually decide to make their code open source if they have developed a tool or a programming framework. If you are using Python, this means providing a requirements.txt file (if using pip and virtualenv), an environment.yml file (if using Anaconda), or a setup.py if your code is a library. It is good practice to provide a section in your README.md that explains how to install these dependencies. Tingran Gao, Shahar Kovalsky, Doug Boyer, and Ingrid Daubechies, SIAM Journal on Mathematics of Data Science, 2018. Sometimes I put a paper on hold and read it after a while.
Papers with code.
Xception: Deep Learning With Depthwise Separable Convolutions, Action-Decision Networks for Visual Tracking With Deep Reinforcement Learning, Image-To-Image Translation With Conditional Adversarial Networks, Quality Aware Network for Set to Set Recognition, Self-Supervised Learning of Visual Features Through Embedding Images Into Text Topic Spaces, Escape From Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models, A Distributional Perspective on Reinforcement Learning, Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks, Deep Transfer Learning with Joint Adaptation Networks, Training Deep Networks without Learning Rates Through Coin Betting, Full Resolution Image Compression With Recurrent Neural Networks, SurfaceNet: An End-To-End 3D Neural Network for Multiview Stereopsis, Doubly Stochastic Variational Inference for Deep Gaussian Processes, TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals, Jointly Attentive Spatial-Temporal Pooling Networks for Video-Based Person Re-Identification, Synthesizing 3D Shapes via
Modeling Multi-View Depth Maps and Silhouettes With Deep Generative Networks, Borrowing Treasures From the Wealthy: Deep Transfer Learning Through Selective Joint Fine-Tuning, Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes, ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching, Differentiable Learning of Logical Rules for Knowledge Base Reasoning, Person Search With Natural Language Description, Multi-Channel Weighted Nuclear Norm Minimization for Real Color Image Denoising, Unsupervised Learning by Predicting Noise, Localizing Moments in Video With Natural Language, End-To-End 3D Face Reconstruction With Deep Neural Networks, CoupleNet: Coupling Global Structure With Local Parts for Object Detection, A Deep Regression Architecture With Two-Stage Re-Initialization for High Performance Facial Landmark Detection, Modeling Relationships in Referential Expressions With Compositional Modular Networks, Curiosity-driven Exploration by Self-supervised Prediction, Wavelet-SRNet: A Wavelet-Based CNN for Multi-Scale Face Super Resolution, The Neural Hawkes Process: A Neurally Self-Modulating Multivariate Point Process, Online and Linear-Time Attention by Enforcing Monotonic Alignments, Factorized Bilinear Models for Image Recognition, Net-Trim: Convex Pruning of Deep Neural Networks with Performance Guarantee, On-the-fly Operation Batching in Dynamic Computation Graphs, Visual Translation Embedding Network for Visual Relation Detection, A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning, Towards Diverse and Natural Image Descriptions via a Conditional GAN, CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos, A Generic Deep Architecture for Single Image Reflection Removal and Image Smoothing, Deep IV: A Flexible Approach for Counterfactual Prediction, EAST: An Efficient and Accurate Scene Text Detector, SST: Single-Stream Temporal Action 
Proposals, Predicting Deeper Into the Future of Semantic Segmentation, L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space, TALL: Temporal Activity Localization via Language Query, Hybrid Reward Architecture for Reinforcement Learning, Modulating early visual processing by language, Adversarial Examples for Semantic Segmentation and Object Detection, Learning Discrete Representations via Information Maximizing Self-Augmented Training, Efficient Diffusion on Region Manifolds: Recovering Small Objects With Compact CNN Representations, Real Time Image Saliency for Black Box Classifiers, FC4: Fully Convolutional Color Constancy With Confidence-Weighted Pooling, Multiple People Tracking by Lifted Multicut and Person Re-Identification, Learned D-AMP: Principled Neural Network based Compressive Image Recovery, GP CaKe: Effective brain connectivity with causal kernels, Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network, Semantic Video CNNs Through Representation Warping, EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis, Safe Model-based Reinforcement Learning with Stability Guarantees, Semantic Compositional Networks for Visual Captioning, On-Demand Learning for Deep Image Restoration, Stabilizing Training of Generative Adversarial Networks through Regularization, Structured Bayesian Pruning via Log-Normal Multiplicative Noise, Deriving Neural Architectures from Sequence and Graph Kernels, Masked Autoregressive Flow for Density Estimation, Learning Residual Images for Face Attribute Manipulation, Learning to Generate Long-term Future via Hierarchical Prediction, Accurate Optical Flow via Direct Cost Volume Processing, Generalized Orderless Pooling Performs Implicit Salient Matching, Comparative Evaluation of Hand-Crafted and Learned Local Features, SchNet: A continuous-filter convolutional neural network for modeling quantum interactions, Temporal Generative Adversarial Nets With Singular Value Clipping, 
Multiplicative Normalizing Flows for Variational Bayesian Neural Networks, Semantic Image Inpainting With Deep Generative Models, A Linear-Time Kernel Goodness-of-Fit Test, Least Squares Generative Adversarial Networks, Diversified Texture Synthesis With Feed-Forward Networks, No Fuss Distance Metric Learning Using Proxies, Template Matching With Deformable Diversity Similarity, What's in a Question: Using Visual Questions as a Form of Supervision, Face Normals "In-The-Wild" Using Fully Convolutional Networks, Conditional Image Synthesis with Auxiliary Classifier GANs, 3D-PRNN: Generating Shape Primitives With Recurrent Neural Networks, Structured Embedding Models for Grouped Data, Unified Deep Supervised Domain Adaptation and Generalization, Transformation-Grounded Image Generation Network for Novel 3D View Synthesis, Structured Attentions for Visual Question Answering, Geometric Loss Functions for Camera Pose Regression With Deep Learning, VidLoc: A Deep Spatio-Temporal Model for 6-DoF Video-Clip Relocalization, QMDP-Net: Deep Learning for Planning under Partial Observability, Hierarchical Boundary-Aware Neural Encoder for Video Captioning, Unsupervised Learning of Disentangled Representations from Video, Deep Learning on Lie Groups for Skeleton-Based Action Recognition, Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection, 3D Point Cloud Registration for Localization Using a Deep Neural Network Auto-Encoder, StyleNet: Generating Attractive Visual Captions With Styles, Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon, Continual Learning Through Synaptic Intelligence, Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes, Learning Detection With Diverse Proposals, LCNN: Lookup-Based Convolutional Neural Network, Towards Accurate Multi-Person Pose Estimation in the Wild, Real-Time Neural Style Transfer for Videos, Speaking the Same Language: Matching Machine to Human 
Captions by Adversarial Training, Deep Co-Occurrence Feature Learning for Visual Object Recognition, Joint distribution optimal transportation for domain adaptation, Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields, SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization, A Unified Approach of Multi-Scale Deep and Hand-Crafted Features for Defocus Estimation, Learning Spread-Out Local Feature Descriptors, DropoutNet: Addressing Cold Start in Recommender Systems, Phrase Localization and Visual Relationship Detection With Comprehensive Image-Language Cues, Harvesting Multiple Views for Marker-Less 3D Human Pose Annotations, Deep 360 Pilot: Learning a Deep Agent for Piloting Through 360° Sports Videos, Neural Message Passing for Quantum Chemistry, State-Frequency Memory Recurrent Neural Networks, DeepCD: Learning Deep Complementary Descriptors for Patch Representations, Contrastive Learning for Image Captioning, Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite Sum Structure, Learning High Dynamic Range From Outdoor Panoramas, Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors, Learning to Detect Salient Objects With Image-Level Supervision, Improved Variational Autoencoders for Text Modeling using Dilated Convolutions, Interspecies Knowledge Transfer for Facial Keypoint Detection, Long Short-Term Memory Kalman Filters: Recurrent Neural Estimators for Pose Regularization, Temporal Context Network for Activity Localization in Videos, Incremental Learning of Object Detectors Without Catastrophic Forgetting, Dense Captioning With Joint Inference and Visual Context, Asymmetric Tri-training for Unsupervised Domain Adaptation, Reducing Reparameterization Gradient Variance, Exploiting Saliency for Object Segmentation From Image Level Labels, A Dirichlet Mixture Model of Hawkes Processes for Event Sequence Clustering, Straight to Shapes: Real-Time Detection 
of Encoded Shapes, Dual Discriminator Generative Adversarial Nets, Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net, Learning Spherical Convolution for Fast Features from 360° Imagery, Learning to Detect Sepsis with a Multitask Gaussian Process RNN Classifier, When Unsupervised Domain Adaptation Meets Tensor Representations, Image Super-Resolution Using Dense Skip Connections, Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer, STD2P: RGBD Semantic Segmentation Using Spatio-Temporal Data-Driven Pooling, Learning Continuous Semantic Representations of Symbolic Expressions, Combined Group and Exclusive Sparsity for Deep Neural Networks, Hash Embeddings for Efficient Word Representations, Accuracy First: Selecting a Differential Privacy Level for Accuracy Constrained ERM, Disentangled Representation Learning GAN for Pose-Invariant Face Recognition, Learning to Pivot with Adversarial Networks, Learning Dynamic Siamese Network for Visual Object Tracking, POSEidon: Face-From-Depth for Driver Pose Estimation, Deep Metric Learning via Facility Location, Automatic Spatially-Aware Fashion Concept Discovery, From Motion Blur to Motion Flow: A Deep Learning Solution for Removing Heterogeneous Motion Blur, Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks, Zero-Inflated Exponential Family Embeddings, InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations, Weakly-Supervised Learning of Visual Relations, Multi-Label Image Recognition by Recurrently Discovering Attentional Regions, Scene Parsing With Global Context Embedding, Deep Mean-Shift Priors for Image Restoration, Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition, Fully-Adaptive Feature Sharing in Multi-Task Networks With Applications in Person Attribute Classification, Structured Generative Adversarial Networks, Joint Gap Detection and Inpainting of Line Drawings, Chained 
Multi-Stream Networks Exploiting Pose, Motion, and Appearance for Action Classification and Detection, Adversarial Feature Matching for Text Generation, BIER - Boosting Independent Embeddings Robustly, Predictive-Corrective Networks for Action Detection, A Bayesian Data Augmentation Approach for Learning Deep Models, Attentive Semantic Video Generation Using Captions, MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network, Deep Unsupervised Similarity Learning Using Partially Ordered Sets, DualNet: Learn Complementary Features for Image Recognition, Neural system identification for large populations separating “what” and “where”, FALKON: An Optimal Large Scale Kernel Method, Deep Future Gaze: Gaze Anticipation on Egocentric Videos Using Adversarial Networks, Deep Learning with Topological Signatures, Streaming Sparse Gaussian Process Approximations, RPAN: An End-To-End Recurrent Pose-Attention Network for Action Recognition in Videos, Awesome Typography: Statistics-Based Text Effects Transfer, RoomNet: End-To-End Room Layout Estimation, Deep Spatial-Semantic Attention for Fine-Grained Sketch-Based Image Retrieval, Few-Shot Learning Through an Information Retrieval Lens, Estimating Accuracy from Unlabeled Data: A Probabilistic Logic Approach, Learning to Push the Limits of Efficient FFT-Based Image Deconvolution, Deep Multitask Architecture for Integrated 2D and 3D Human Sensing, Estimating Mutual Information for Discrete-Continuous Mixtures, Spatially-Varying Blur Detection Based on Multiscale Fused and Sorted Transform Coefficients of Gradient Magnitudes, StyleBank: An Explicit Representation for Neural Image Style Transfer, Automatic Discovery of the Statistical Types of Variables in a Dataset, Learning Proximal Operators: Using Denoising Networks for Regularizing Inverse Imaging Problems, Non-Local Deep Features for Salient Object Detection, Structure-Measure: A New Way to Evaluate Foreground Maps, Shallow Updates for Deep Reinforcement 
Learning, Wasserstein Generative Adversarial Networks, Variational Dropout Sparsifies Deep Neural Networks, Off-policy evaluation for slate recommendation, Attributes2Classname: A Discriminative Model for Attribute-Based Unsupervised Zero-Shot Learning, Benchmarking Denoising Algorithms With Real Photographs, Neural Aggregation Network for Video Face Recognition, Learned Contextual Feature Reweighting for Image Geo-Localization, Streaming Weak Submodularity: Interpreting Neural Networks on the Fly, CVAE-GAN: Fine-Grained Image Generation Through Asymmetric Training, VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation, Spherical convolutions and their application in molecular modelling, Convolutional Neural Network Architecture for Geometric Matching, Neural Face Editing With Intrinsic Image Disentangling, Realistic Dynamic Facial Textures From a Single Image Using GANs, Predictive State Recurrent Neural Networks, Deep TextSpotter: An End-To-End Trainable Scene Text Localization and Recognition Framework, ExtremeWeather: A large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events, Hunt For The Unique, Stable, Sparse And Fast Feature Learning On Graphs, Joint Learning of Object and Action Detectors, Asynchronous Stochastic Gradient Descent with Delay Compensation, Unrolled Memory Inner-Products: An Abstract GPU Operator for Efficient Vision-Related Computations, Maximizing Subset Accuracy with Recurrent Neural Networks in Multi-label Classification, Self-Organized Text Detection With Minimal Post-Processing via Border Learning, Coordinated Multi-Agent Imitation Learning, Gradient descent GAN optimization is locally stable, Removing Rain From Single Images via a Deep Detail Network, Convexified Convolutional Neural Networks, VegFru: A Domain-Specific Dataset for Fine-Grained Visual Categorization, Attend and Predict: Understanding Gene 
Regulation by Selective Attention on Chromatin, Differential Angular Imaging for Material Recognition, A Multilayer-Based Framework for Online Background Subtraction With Freely Moving Cameras, Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation, Max-value Entropy Search for Efficient Bayesian Optimization, Higher-Order Integration of Hierarchical Convolutional Activations for Fine-Grained Visual Categorization, Generalized Deep Image to Image Regression, Adversarial Image Perturbation for Privacy Protection -- A Game Theory Perspective, Predicting Human Activities Using Stochastic Grammar, DESIRE: Distant Future Prediction in Dynamic Scenes With Interacting Agents, High-Order Attention Models for Visual Question Answering, f-GANs in an Information Geometric Nutshell, Revisiting IM2GPS in the Deep Learning Era, Attentional Correlation Filter Network for Adaptive Visual Tracking, Learning Cross-Modal Deep Representations for Robust Pedestrian Detection, Cognitive Mapping and Planning for Visual Navigation, Optimized Pre-Processing for Discrimination Prevention, Scalable Log Determinants for Gaussian Process Kernel Learning, A Hierarchical Approach for Generating Descriptive Image Paragraphs, Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization, Practical Data-Dependent Metric Compression with Provable Guarantees. [code and more] Y. Xu and W. Yin. When a Good Translation is Wrong in Context: Context-Aware Machine Translation Improves on Deixis, Ellipsis, and Lexical Cohesion.
Modeling Dog Behavior From Visual Data, EC-Net: an Edge-aware Point set Consolidation Network, Learning a Discriminative Feature Network for Semantic Segmentation, Partial Transfer Learning With Selective Adversarial Networks, Cross-Modal Deep Variational Hand Pose Estimation, Between-Class Learning for Image Classification, AON: Towards Arbitrarily-Oriented Text Recognition, Learning Convolutional Networks for Content-Weighted Image Compression, Diversity Regularized Spatiotemporal Attention for Video-Based Person Re-Identification, Dynamic Multimodal Instance Segmentation Guided by Natural Language Queries, CBMV: A Coalesced Bidirectional Matching Volume for Disparity Estimation, Deep Texture Manifold for Ground Terrain Recognition, Audio-Visual Event Localization in Unconstrained Videos, First Order Generative Adversarial Networks, Visual Coreference Resolution in Visual Dialog using Neural Module Networks, SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks, Deep Reinforcement Learning of Marked Temporal Point Processes, Explicit Inductive Bias for Transfer Learning with Convolutional Networks, LEGO: Learning Edge With Geometry All at Once by Watching Videos, Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes, Multi-Agent Diverse Generative Adversarial Networks, Face Aging With Identity-Preserved Conditional Generative Adversarial Networks, Learning to Separate Object Sounds by Watching Unlabeled Video, Exploiting the Potential of Standard Convolutional Autoencoders for Image Restoration by Evolutionary Search, Im2Flow: Motion Hallucination From Static Images for Action Recognition, ISTA-Net: Interpretable Optimization-Inspired Deep Network for Image Compressive Sensing, Hallucinated-IQA: No-Reference Image Quality Assessment via Adversarial Learning, CondenseNet: An Efficient DenseNet Using Learned Group 
Convolutions, HashGAN: Deep Learning to Hash With Pair Conditional Wasserstein GAN, Hierarchical Relational Networks for Group Activity Recognition and Retrieval, Collaborative and Adversarial Network for Unsupervised Domain Adaptation, Geometry-Aware Scene Text Detection With Instance Transformation Network, CSGNet: Neural Shape Parser for Constructive Solid Geometry, Local Spectral Graph Convolution for Point Set Feature Learning, GraphBit: Bitwise Interaction Mining via Deep Reinforcement Learning, Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal, Fully-Convolutional Point Networks for Large-Scale Point Clouds, Learning Superpixels With Segmentation-Aware Affinity Loss, Zero-Shot Visual Recognition Using Semantics-Preserving Adversarial Embedding Networks, Crowd Counting With Deep Negative Correlation Learning, Dimensionality-Driven Learning with Noisy Labels, Deep Expander Networks: Efficient Deep Networks from Graph Theory, Low-Shot Learning With Large-Scale Diffusion, Cross-Domain Self-Supervised Multi-Task Feature Learning Using Synthetic Imagery, Learning Descriptor Networks for 3D Shape Synthesis and Analysis, Disentangling Factors of Variation with Cycle-Consistent Variational Auto-Encoders, CTAP: Complementary Temporal Action Proposal Generation, DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors, Conditional Image-Text Embedding Networks, EPINET: A Fully-Convolutional Neural Network Using Epipolar Geometry for Depth From Light Field Images, Glimpse Clouds: Human Activity Recognition From Unstructured Feature Points, Bayesian Optimization of Combinatorial Structures, FeaStNet: Feature-Steered Graph Convolutions for 3D Shape Analysis, Learning Type-Aware Embeddings for Fashion Compatibility, Sliced Wasserstein Distance for Learning Gaussian Mixture Models, Revisiting Deep Intrinsic Image Decompositions, A Spectral Approach to Gradient Estimation for Implicit Distributions, 
Hierarchical Novelty Detection for Visual Object Recognition, Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies, Learning Generative ConvNets via Multi-Grid Modeling and Sampling, Learning 3D Shape Completion From Laser Scan Data With Weak Supervision, Triplet Loss in Siamese Network for Object Tracking, Adversarial Attack on Graph Structured Data, Arbitrary Style Transfer With Deep Feature Reshuffle, Visual Question Reasoning on General Dependency Tree, Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition, Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks, Weakly-Supervised Action Segmentation With Iterative Soft Boundary Assignment, Recovering 3D Planes from a Single Image via Convolutional Neural Networks, SegStereo: Exploiting Semantic Information for Disparity Estimation, Functional Gradient Boosting based on Residual Network Perception, Generative Probabilistic Novelty Detection with Adversarial Autoencoders, Convolutional Sequence to Sequence Model for Human Dynamics, Joint Pose and Expression Modeling for Facial Expression Recognition, Grounding Referring Expressions in Images by Variational Context, Rethinking the Form of Latent States in Image Captioning, Open Set Domain Adaptation by Backpropagation, SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters, Deep Learning Under Privileged Information Using Heteroscedastic Dropout, Action Sets: Weakly Supervised Action Segmentation Without Ordering Constraints, Learning to Forecast and Refine Residual Motion for Image-to-Video Generation, Multi-Scale Weighted Nuclear Norm Image Restoration, Fine-Grained Visual Categorization using Meta-Learning Optimization with Sample Selection of Auxiliary Data, Assessing Generative Models via Precision and Recall, Towards Human-Machine Cooperation: Self-Supervised Sample Mining for Object Detection, Variational Autoencoders for Deforming 3D 
Mesh Models, Min-Entropy Latent Model for Weakly Supervised Object Detection, Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering, Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace, Learning a Discriminative Filter Bank Within a CNN for Fine-Grained Recognition, Finding Influential Training Samples for Gradient Boosted Decision Trees, Cross-View Image Synthesis Using Conditional GANs, Joint Optimization Framework for Learning With Noisy Labels, Future Person Localization in First-Person Videos, AutoLoc: Weakly-supervised Temporal Action Localization in Untrimmed Videos, Learning Transferable Architectures for Scalable Image Recognition, Mix and Match Networks: Encoder-Decoder Alignment for Zero-Pair Image Translation, Decouple Learning for Parameterized Image Operators, Generalized Earley Parser: Bridging Symbolic Grammars and Sequence Data for Future Prediction, Adaptive Skip Intervals: Temporal Abstraction for Recurrent Dynamical Models, AMNet: Memorability Estimation With Attention, Human Pose Estimation With Parsing Induced Learner, ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking, A Joint Sequence Fusion Model for Video Question Answering and Retrieval, Learning Face Age Progression: A Pyramid Architecture of GANs, Robust Physical-World Attacks on Deep Learning Visual Classification, High-Quality Prediction Intervals for Deep Learning: A Distribution-Free, Ensembled Approach, Meta-Learning by Adjusting Priors Based on Extended PAC-Bayes Theory, Multimodal Explanations: Justifying Decisions and Pointing to the Evidence, Accelerating Natural Gradient with Higher-Order Invariance, Hierarchical Multi-Label Classification Networks, Boosting Domain Adaptation by Discovering Latent Domains, Logo Synthesis and Manipulation With Clustered Generative Adversarial Networks, PacGAN: The power of two samples in generative adversarial networks, Attention Clusters: Purely Attention 
Based Local Feature Integration for Video Classification, Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation, Salient Object Detection Driven by Fixation Prediction, Semantic Video Segmentation by Gated Recurrent Flow Propagation, Constraint-Aware Deep Neural Network Compression, Statistically-motivated Second-order Pooling, Analyzing Uncertainty in Neural Machine Translation, Learning Dynamics of Linear Denoising Autoencoders, Decoupled Parallel Backpropagation with Convergence Guarantee, Classification from Pairwise Similarity and Unlabeled Data, oi-VAE: Output Interpretable VAEs for Nonlinear Group Factor Analysis, Modeling Sparse Deviations for Compressed Sensing using Generative Models, Pixels, Voxels, and Views: A Study of Shape Representations for Single View 3D Object Shape Prediction, Towards Open-Set Identity Preserving Face Synthesis, Five-Point Fundamental Matrix Estimation for Uncalibrated Cameras, BourGAN: Generative Networks with Metric Embeddings, Fast Information-theoretic Bayesian Optimisation, Deep Variational Reinforcement Learning for POMDPs, Specular-to-Diffuse Translation for Multi-View Reconstruction, Dynamic Conditional Networks for Few-Shot Learning, Learning Facial Action Units From Web Images With Scalable Weakly Supervised Clustering, High-Resolution Image Synthesis and Semantic Manipulation With Conditional GANs, Deep Defense: Training DNNs with Improved Adversarial Robustness, Learning K-way D-dimensional Discrete Codes for Compact Embedding Representations, Light Structure from Pin Motion: Simple and Accurate Point Light Calibration for Physics-based Modeling, Non-metric Similarity Graphs for Maximum Inner Product Search, Deep Non-Blind Deconvolution via Generalized Low-Rank Approximation, Don’t Just Assume Look and Answer: Overcoming Priors for Visual Question Answering, Learning Dual Convolutional Neural Networks for Low-Level Vision, The Mirage of Action-Dependent Baselines in Reinforcement 
Learning, DVQA: Understanding Data Visualizations via Question Answering, Detecting and Correcting for Label Shift with Black Box Predictors, Conditional Prior Networks for Optical Flow, Generative Adversarial Learning Towards Fast Weakly Supervised Detection, Adversarial Learning with Local Coordinate Coding, Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks, AttnGAN: Fine-Grained Text to Image Generation With Attentional Generative Adversarial Networks, Learning to Explain: An Information-Theoretic Perspective on Model Interpretation, Gradually Updated Neural Networks for Large-Scale Image Recognition, Learning Steady-States of Iterative Algorithms over Graphs, Progressive Attention Guided Recurrent Network for Salient Object Detection, Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains, Unsupervised holistic image generation from key local patches, Inner Space Preserving Generative Pose Machine, Bilevel Programming for Hyperparameter Optimization and Meta-Learning, Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition, Breaking the Activation Function Bottleneck through Adaptive Parameterization, Ultra Large-Scale Feature Selection using Count-Sketches, Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks, Orthogonally Decoupled Variational Gaussian Processes, Batch Bayesian Optimization via Multi-objective Acquisition Ensemble for Automated Analog Circuit Design, A Modulation Module for Multi-task Learning with Applications in Image Retrieval, A Memory Network Approach for Story-Based Temporal Summarization of 360° Videos, Towards Effective Low-Bitwidth Convolutional Neural Networks, Disentangling Factors of Variation by Mixing Them, Weakly-supervised Video Summarization using Variational Encoder-Decoder and Web Prior, Learning Longer-term Dependencies in RNNs with Auxiliary Losses, Contour Knowledge Transfer for Salient 
Object Detection, HybridNet: Classification and Reconstruction Cooperation for Semi-Supervised Learning, Sidekick Policy Learning for Active Visual Exploration, Learning to Localize Sound Source in Visual Scenes, Diverse and Coherent Paragraph Generation from Images, DRACO: Byzantine-resilient Distributed Training via Redundant Gradients, Inter and Intra Topic Structure Learning with Word Embeddings, Estimating the Success of Unsupervised Image to Image Translation, Dynamic-Structured Semantic Propagation Network, The Description Length of Deep Learning models, Stereo Vision-based Semantic 3D Object and Ego-motion Tracking for Autonomous Driving, Blind Justice: Fairness with Encrypted Sensitive Attributes, Transfer Learning via Learning to Transfer, Deepcode: Feedback Codes via Deep Learning, A Framework for Evaluating 6-DOF Object Trackers, Differentially Private Database Release via Kernel Mean Embeddings, Recognizing Human Actions as the Evolution of Pose Estimation Maps, Connecting Pixels to Privacy and Utility: Automatic Redaction of Private Information in Images, DeLS-3D: Deep Localization and Segmentation With a 3D Semantic Map, Geolocation Estimation of Photos using a Hierarchical Model and Scene Classification, Diverse Conditional Image Generation by Stochastic Regression with Latent Drop-Out Codes, Inference Suboptimality in Variational Autoencoders, Feedback-Prop: Convolutional Neural Network Inference Under Partial Evidence, Quadrature-based features for kernel approximation, Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking, Single Image Water Hazard Detection using FCN with Reflection Attention Units, Multimodal Generative Models for Scalable Weakly-Supervised Learning, Importance Weighted Transfer of Samples in Reinforcement Learning, Feature Generating Networks for Zero-Shot Learning, DICOD: Distributed Convolutional Coordinate Descent for Convolutional Sparse Coding, CapProNet: Deep Feature Learning via 
Orthogonal Projections onto Capsule Subspaces, Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages, A Hybrid l1-l0 Layer Decomposition Model for Tone Mapping, Spatially-Adaptive Filter Units for Deep Neural Networks, Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives, Lifelong Learning via Progressive Distillation and Retrospection, CLEAR: Cumulative LEARning for One-Shot One-Class Image Recognition, Not to Cry Wolf: Distantly Supervised Multitask Learning in Critical Care, Learning Answer Embeddings for Visual Question Answering, Information Constraints on Auto-Encoding Variational Bayes, Parallel Bayesian Network Structure Learning, Ring Loss: Convex Feature Normalization for Face Recognition, Teaching Categories to Human Learners With Visual Explanations, Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization, Convergent Tree Backup and Retrace with Function Approximation, Gaze Prediction in Dynamic 360° Immersive Videos, Statistical Recurrent Models on Manifold valued Data, End-to-End Flow Correlation Tracking With Spatial-Temporal Attention, Bridging the Gap Between Value and Policy Based Reinforcement Learning, REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models, LightGBM: A Highly Efficient Gradient Boosting Decision Tree, Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation, Large Pose 3D Face Reconstruction From a Single Image via Direct Volumetric CNN Regression, A Unified Approach to Interpreting Model Predictions, ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games, PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation, Fully Convolutional Instance-Aware Semantic Segmentation, Aggregated Residual Transformations for Deep Neural Networks, Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial 
Network, Unsupervised Image-to-Image Translation Networks, Photographic Image Synthesis With Cascaded Refinement Networks, High-Resolution Image Inpainting Using Multi-Scale Neural Patch Synthesis, SphereFace: Deep Hypersphere Embedding for Face Recognition, Efficient Modeling of Latent Information in Supervised Learning using Gaussian Processes, Toward Multimodal Image-to-Image Translation, Learning to Discover Cross-Domain Relations with Generative Adversarial Networks, PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space, Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, FlowNet 2.0: Evolution of Optical Flow Estimation With Deep Networks, Channel Pruning for Accelerating Very Deep Neural Networks, Inferring and Executing Programs for Visual Reasoning, DSOD: Learning Deeply Supervised Object Detectors From Scratch, Arbitrary Style Transfer in Real-Time With Adaptive Instance Normalization, Accelerating Eulerian Fluid Simulation With Convolutional Networks, Learning Disentangled Representations with Semi-Supervised Deep Generative Models, Inductive Representation Learning on Large Graphs, Regressing Robust and Discriminative 3D Morphable Models With a Very Deep Neural Network, How Far Are We From Solving the 2D & 3D Face Alignment Problem?