Quanvolutional neural networks: powering image recognition with quantum circuits
Abstract
Convolutional neural networks (CNNs) have rapidly risen in popularity for many machine learning applications, particularly in the field of image recognition. Much of the benefit generated from these networks comes from their ability to extract features from the data in a hierarchical manner. These features are extracted using various transformational layers, notably the convolutional layer which gives the model its name. In this work, we introduce a new type of transformational layer called a quantum convolution, or quanvolutional layer. Quanvolutional layers operate on input data by locally transforming the data using a number of random quantum circuits, in a way that is similar to the transformations performed by random convolutional filter layers. Provided these quantum transformations produce meaningful features for classification purposes, this algorithm could be of practical use for near-term quantum computers as it requires small quantum circuits with little to no error correction. In this work, we empirically evaluated the potential benefit of these quantum transformations by comparing three types of models built on the MNIST dataset: CNNs, quanvolutional neural networks (QNNs), and CNNs with additional non-linearities introduced. Our results showed that the QNN models had both higher test set accuracy and faster training compared with the purely classical CNNs.
Introduction
The field of quantum machine learning (QML) has experienced rapid growth over the past few years, as evidenced by the rapid increase in impactful QML papers (Jordan 2011). Several excellent papers (Dunjko and Briegel 2018; Ciliberto et al. 2018; Dunjko et al. 2016; Perdomo-Ortiz et al. 2017; Biamonte et al. 2018) encapsulate the current state of the QML field. Machine learning algorithms tend to give probabilistic results and contain correlated components but at the same time suffer computational bottlenecks due to the curse of dimensionality. Similarly, quantum computers by their very nature provide probabilistic results upon measurement and are formed from intrinsically coupled quantum systems, which can provide potentially exponential speedups due to their ability to perform massively parallel computations on the superposition of quantum states. While quantum computers are by no means expected to replace classical computing, they have the potential to be powerful components in an overall machine learning application pipeline.
Our research focuses on a novel quantum algorithm which falls squarely into the regime of “hybrid classical-quantum” algorithms, extending the classical algorithm of convolutional neural networks (CNNs). In the years since their introduction by LeCun et al. (1998), CNNs have become the standard for many machine learning applications. Although the emerging capsule networks (Sabour et al. 2017) have shown promise for pushing the bounds of machine learning performance even further, various flavors of CNNs have held the accuracy records on many benchmark image recognition problems for years, including MNIST, CIFAR, and SVHN.
CNNs operate as a “stack” of transformations that are applied to input data. These transformations are used to extract useful features in the data which can be leveraged for classification purposes. The convolutional layers in a CNN stack are each composed of N convolutional filters. Each filter in a convolutional layer iteratively convolves local subsections of the full input to produce feature maps, and the output of the convolutional layer will be a tensor of N feature maps which each contain information about different spatially local patterns in the data. Each layer of the CNN repeats this process on the output of the layer preceding it, resulting in increasingly abstract features. These abstract features are useful because classifiers built on top of these transformations (or more likely a series of transformations through an entire network stack) produce far more accurate results than classifiers built directly on top of the input data itself.
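To make this feature-map construction concrete, the following is a minimal NumPy sketch (our illustration, not the authors' code) of a single convolutional layer: each of N filters slides over an image, performs an element-wise multiply-and-sum on each local patch, and produces one feature map; the layer's output stacks the N maps. Valid padding and a stride of 1 are assumptions made here for simplicity.

```python
import numpy as np

def conv_layer(image, filters, stride=1):
    """Apply N k-by-k filters to a 2D image, returning N feature maps (valid padding)."""
    n_filters, k, _ = filters.shape
    h, w = image.shape
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    maps = np.zeros((n_filters, out_h, out_w))
    for f in range(n_filters):
        for i in range(out_h):
            for j in range(out_w):
                patch = image[i * stride:i * stride + k, j * stride:j * stride + k]
                maps[f, i, j] = np.sum(patch * filters[f])  # element-wise multiply + sum
    return maps

rng = np.random.default_rng(0)
image = rng.random((28, 28))               # e.g., one MNIST-sized greyscale image
filters = rng.normal(size=(3, 5, 5))       # three random 5-by-5 filters
feature_maps = conv_layer(image, filters)  # tensor of shape (3, 24, 24)
```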
In this work, we investigate a new type of model which we will call quanvolutional neural networks (QNNs). QNNs extend the capabilities of CNNs by leveraging certain powerful aspects of quantum computation. QNNs add a new type of transformational layer to the standard CNN architecture: the quantum convolutional (or quanvolutional) layer. Quanvolutional layers are made up of a group of N quantum filters which operate much like their classical convolutional layer counterparts, producing feature maps by locally transforming input data. The key difference is that quanvolutional filters extract features from input data by transforming spatially local subsections of data using quantum circuits. We hypothesize that features produced by quanvolutional layers could increase the accuracy of machine learning models for classification purposes. If this hypothesis holds true, then QNNs will be a powerful application for near-term quantum computers, or noisy (not error-corrected), intermediate-scale (50–100 qubit) quantum (NISQ) computers (Preskill 2018). This is for three reasons:
1. Quanvolutional filters are applied only to local subsections of the input data, so they can operate using a small number of quantum bits (qubits) and shallow gate depths.
2. Quanvolutions are resilient to error; as long as the error model in the quantum circuit is consistent, it can essentially be thought of as another component of the random quantum circuit.
3. Since the output of random quantum circuits cannot be simulated classically at scale, they would require quantum devices for efficient computation.
Architectural design of QNNs
Design motivations
This section serves as a motivational preface to the use of quanvolutional transformations within a broader machine learning framework. First, leveraging random non-linear features is well-known to be useful within many machine learning algorithms for increasing accuracy or decreasing training times, such as in CNNs using random convolutions and echo state networks (Jaeger and Haas 2004; Ranzato et al. 2007). Second, as pointed out in Mitarai et al. (2018), quantum circuits are able to model complex functional relationships, such as universal quantum cellular automata, which is infeasible using polynomial-sized classical computational resources. The idea of merging these observations together—leveraging some form of non-linear quantum circuit transformations for machine learning purposes—has recently emerged in the QML field. We may consider briefly the current state of the QML field with algorithms in this space, as well as the potential “quantum advantage” motivations for such algorithms.
Several quantum variations of classical models have been recently developed, including quantum reservoir computing (QRC) (Fujii and Nakajima 2016), quantum circuit learning (QCL) (Mitarai et al. 2018), continuous-variable quantum neural networks (Killoran et al. 2018), quantum kitchen sinks (QKS) (Wilson et al. 2018), quantum variational classifiers, and quantum kernel estimators (Havlíček et al. 2019). The QNN approach similarly aims to use the novelty of quantum circuit transformations within a machine learning framework, while differing from previous works in (a) the particular methodology around processing classical information into and out of the different quantum circuits (more details in Section 2.3) and (b) the flexible integration of such computations into state-of-the-art deep neural network machine learning models. Another key difference in the QNN approach is the lack of variational tuning of the quantum components of the model. Several frameworks have been presented for using variational methods to train the quantum components of such quantum feature transformation methods (Bergholm et al. 2018; Crooks 2018; Schuld et al. 2018). This work, however, builds on the static-feature methodology, in some sense extending the QRC methodology.
In terms of a quantum advantage or quantum supremacy consideration of this model, this paper makes an argument similar to that of Havlíček et al. (2019). In essence, QNNs can efficiently access kernel functions in high-dimensional Hilbert spaces, which if useful for machine learning purposes could provide a pathway to quantum advantage.
Quanvolutional network design
As laid out briefly in Section 1, QNNs are simply an extension of classical CNNs, with an additional transformational layer, called the quanvolutional layer. Quanvolutional layers integrate into the machine learning stack in exactly the same way as classical convolutional layers, allowing the user to:
- Define any arbitrary integer value for the number of quanvolutional filters in a quanvolutional layer
- Stack any number of new quanvolutional layers on top of any other layer in the network stack
- Provide layer-specific configuration attributes (encoding and decoding methods, average number of quantum gates per qubit in the quantum circuit, etc.); an illustrative configuration sketch follows this list
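As a concrete (hypothetical) illustration of such layer-specific attributes, the following sketch shows one possible configuration for a single quanvolutional layer. The attribute names and values are our own illustrative choices; the paper does not define a configuration API.

```python
# Hypothetical configuration for one quanvolutional layer; the attribute names are
# illustrative only and not an interface defined in the paper.
quanv_layer_config = {
    "n_filters": 3,                # number of quanvolutional filters in the layer
    "filter_size": 3,              # each filter acts on 3x3 spatially local patches
    "encoding": "threshold",       # how classical pixel values initialize the qubits
    "decoding": "count_ones_in_most_likely_state",  # how measurements become a scalar
    "avg_gates_per_qubit": 2.0,    # average number of quantum gates per qubit
}
```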
With these conditions satisfied, the quanvolutional layer is highly generalizable and just as easy to implement in any architecture as its classical predecessor. The number of such layers, the order in which they are implemented, and the particular parameters of each are entirely left up to the end user’s specifications. The generality of QNNs is visualized in Fig. 1. Figure 1a shows just one example QNN realization, where the first layer of the stack is a quanvolutional layer with three quanvolutional filters, followed by a pooling layer, a convolutional layer with six filters, a second pooling layer, and two final fully connected (FC) layers, wherein the final FC layer represents the target variable output. The diagram conveys the generality and flexibility for architects to change, remove, or add layers as desired. The overall network architecture of Fig. 1a would have exactly the same structure if we replaced the quanvolutional layer with a convolutional layer of three filters, or similarly if we replaced the convolutional layer with a quanvolutional layer of six quanvolutional filters. The difference between the quanvolutional and convolutional layers lies in the way the quanvolutional filters of Fig. 1b perform their calculations, which is laid out in detail in Section 2.3.
Fig. 1
a Simple example of a quanvolutional layer in a full network stack. The quanvolutional layer contains several quanvolutional filters (three in this example) that transform the input data into different output feature maps. b An in-depth look at the processing of classical data into and out of the random quantum circuit in the quanvolutional filter
Quanvolutional filter design
Each quanvolutional filter produces a feature map when applied to an input tensor, by transforming spatially local subsections of that tensor. However, unlike the simple element-wise matrix multiplication operation that a classical convolutional filter applies, a quanvolutional filter transforms input data using a quantum circuit, which can be structured or random. For simplicity and to establish a baseline, in this work, we use randomly generated quantum circuits for the quanvolutional filters as opposed to circuits with a designed structure.
At a high level, a quanvolutional filter transforms a spatially local subsection of the input tensor (a 2D matrix of scalars) into an output scalar using a quantum circuit run on a universal quantum computing (UQC) device, i.e., a computation in the complexity class BQP (Bernstein and Vazirani 1997). We can formalize this process for transforming classical data using quanvolutional filters as follows:
1. Let us consider a single quanvolutional filter. This quanvolutional filter uses a random quantum circuit q, which takes as input spatially local subsections of images from dataset u. For our purposes, we define each of these inputs as u_x, and each u_x will be a 2D matrix of size n-by-n wherein n > 1.
2. Although there are many ways of encoding u_x as an initialized state of q, for each quanvolutional filter, we choose one particular encoding function e; we define the encoded initialization state i_x as i_x = e(u_x).
3. After the quantum circuit is applied to the initialized state i_x, the result of the quantum computation will be an output quantum state o_x, with the relationship o_x = q(i_x) = q(e(u_x)).
4. Although there are many ways of decoding the information made available about o_x through a finite number of measurements, to ensure that the quanvolutional filter output is consistent with similar output from a standard classical convolution, we define the final decoded state as f_x = d(o_x) = d(q(e(u_x))), wherein d is our decoding function and f_x is a scalar value.
5. Let us define the total transformation d(q(e(u_x))) from this point on as the “quanvolutional filter transformation” Q of u_x, i.e., f_x = Q(u_x, e, q, d). In Fig. 1b, a visualization of a single quanvolutional filter is shown, displaying the encoding/applied circuit/decoding process (a code sketch of this pipeline is given after this list).
6. If we consider the number of calculations that occur when applying a classical convolutional filter to input from dataset u, the number of computations required is simply O(n²), placing the computational complexity squarely in P. This is not the case for the computational complexity of Q, which is #P-hard (Huang et al. 2018); this emerges specifically out of the complexity of the random quantum circuit transformation q, while e and d can be performed efficiently on classical devices.
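To make the pipeline f_x = d(q(e(u_x))) concrete, below is a minimal NumPy sketch (our illustration, not the authors' implementation). It substitutes a Haar-random unitary for the gate-level random circuit q, and uses the simple threshold encoding e and "count the |1⟩ qubits in the most likely basis state" decoding d that are described later in Section 3.5; all function names are ours.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def random_unitary(dim):
    """Haar-random unitary standing in for the gate-level random quantum circuit q."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q_mat, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q_mat * phases              # multiply each column by a unit-modulus phase

n = 3                                  # patch size, so the circuit acts on n*n = 9 qubits
dim = 2 ** (n * n)
U = random_unitary(dim)                # one fixed random circuit = one quanvolutional filter

def encode(u_x, threshold=0.0):
    """e: threshold encoding -- map an n-by-n patch to a computational basis state."""
    bits = (u_x.flatten() > threshold).astype(int)
    index = int("".join(map(str, bits)), 2)
    state = np.zeros(dim, dtype=complex)
    state[index] = 1.0
    return state

def decode(o_x):
    """d: pick the most probable basis state and count the qubits in |1>."""
    best = int(np.argmax(np.abs(o_x) ** 2))
    return bin(best).count("1")

def quanv_filter(u_x):
    """Q(u_x) = d(q(e(u_x))): the full quanvolutional filter transformation."""
    return decode(U @ encode(u_x))

patch = rng.random((n, n)) - 0.5       # a spatially local subsection u_x
print(quanv_filter(patch))             # f_x, a scalar feature value
```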
Our experimental argument is the following: let us consider a theoretical situation in which an end user has access to many quanvolutional filters within a quanvolutional layer, with the goal of building a model on dataset u for classification purposes (i.e., generating a model that can properly label an input image with the correct output label). In this example, suppose that we build two types of networks: (1) networks built using solely classical convolutional layers and (2) networks using at least one quanvolutional layer. If the networks of type 1 perform better than those of type 2, a reasonable insight to draw from this outcome is that the quantum features extracted from the training dataset u were not ultimately useful in building abstract features for classification purposes. However, if networks of type 2 routinely and significantly outperformed networks of type 1, we could reasonably draw the insight that the quantum features produced were useful in building features on dataset u for classification purposes. Additionally, this could signal a true quantum advantage, since generating these quantum features is efficient on quantum hardware (within BQP) but, as noted above, #P-hard to simulate classically.
Strengths and limitations of quanvolutional approaches
While Section 2.3 expanded on the potential benefits of quanvolutional filters compared with their classical counterparts, the QNN approach also has several advantages over many other quantum computing algorithms that make it well suited to the NISQ computing era.
- Inherently hybrid algorithm. All NISQ algorithms of interest will be hybrid by nature, and the QNN framework embraces this whole-heartedly. The QNN stack purposely integrates elements from both classical data and algorithms (CNNs) with quantum subprocesses (quanvolutional layers). While inherently hybrid classical-quantum approaches will likely fall short of some of the lofty goals of quantum computing, such as exponential speedups over classical calculations, we contend that the NISQ field will show a “crawl-walk-run” evolution, where modest (polynomial) speedups and improvements will occur before larger (exponential) ones are discovered.
- No QRAM requirements. As pointed out in Aaronson (2015), a major hurdle for QML speedups is the lack of an efficient way of loading large classical data into quantum random access memory (QRAM) for future operations. However, since the QNN framework approach simply requires running many quantum circuits on single data points, there is no need to store entire datasets into QRAM.
- Potential resiliency to unknown but consistent error models. Since quanvolutional layer feature maps result from transforming data through random quantum circuits, it is reasonable to assume that adding in error models does not necessarily invalidate the overall algorithm. Conceptually, many forms of quantum error can be thought of as unknown, and unwanted, gate operations. For example, the user attempts to run a specific quantum circuit, and due to hardware imperfections, some hidden “noisy” quantum gates are also added in the circuit, resulting in a different final quantum state than desired. Since QNNs use the quantum circuits as feature detectors, and it is not clear a priori which quantum circuits lead to the most useful features, adding in some unknown “noise” gates does not necessarily impact feature detection quality overall. We consider that modeling unstable, noisy error effects of NISQ devices in quanvolutional filters is an open question; hence, assessing how well the QNN approach performs in the presence of such error is outside the scope of this paper.
While these points are encouraging for NISQ-era viability, one must also be aware of the limitations, constraints, and open questions of the overall QNN approach.
- Optimal interface with classical data. As described in Section 2.3, two key components of the quantum filter design are the encoding and decoding of the classical information and how they interface with the quantum system. While the encoding and decoding protocols used in this experiment will be discussed in more detail in Section 3.5, it remains an open question how to optimally design these protocols. Further, even if some protocols were determined to be useful, they may be impractical in experimental use. For instance, decoding methods that require more than one measurement of the quantum system quickly become problematic; referencing again (Aaronson 2015), any potential “quantum speedup” disappears for algorithms that require large numbers of quantum measurements. On the other hand, a very different avenue to explore is the utility of weak measurements, which could simplify the decoding protocol at the expense of obtaining less information about the final quantum state. Ideally, encoding and decoding methods would both (1) produce beneficial machine learning features and (2) require minimal measurements and processing. Future work would focus on probing the effects of different encoding protocols (single qubit mappings, parameterized rotations, etc.) and decoding protocols (e.g., full state estimation and weak measurements) to determine how these selections affect QNN performance on different datasets.
- Large number of transformations. Although the quanvolutional filter designs have the advantage that QRAM is not required, they can require a large number of quantum circuit executions. This becomes readily apparent if one considers the number of computations that are required in a classical convolutional layer. For a convolutional layer with 50 convolutional filters using zero-padding and a stride of 1, on an input 100-by-100-by-1 pixel image, 50 × 100 × 100 = 500,000 element-wise matrix multiplication operations will need to be applied on this image alone. Since the quanvolutional layer was designed to function in the same way as a convolutional layer, this means a quanvolutional layer with 50 quanvolutional filters operating on the same input data would require the same 500,000 quanvolutional filter executions on each image as well. This fundamentally poses a challenge, as the runtime difference between training a QNN vs a CNN will depend in part on the difference between running individual convolutional vs quanvolutional filter executions. While both operations are embarrassingly parallelizable, in the NISQ-era quanvolutional executions would likely be orders of magnitude slower than CNN operations, which have been optimized on GPUs and do not encounter the considerable I/O bottleneck currently inherent to classical-quantum systems. Strategies to work around this limitation are outlined in Section 3.5.
- Clear demonstration of quantum advantage. Experimentally verifying a near-term algorithm which shows a quantum computing advantage over classical methods continues to be highly elusive. While some groups continue to work towards experiments that could definitively show some form of quantum supremacy (Boixo et al. 2016), the QNN algorithm laid out in this work aims at providing a general framework and strategy for future such endeavors. However, ultimately the burden of proof is on the QNN algorithm to show why such an approach provides any benefit over other classical transformations. Showing that the QNN approach can be useful in a machine learning context is a good starting point, but true adoption of such quantum methods should only come after a clear benefit is displayed. These comparisons will be investigated in detail in Section 4.
Experimental design
Our experiments in this work were designed to highlight the novelties introduced by the QNN algorithm: the generalizability of quanvolutional layers inside a typical CNN architecture, the ability to use this quantum algorithm on practical datasets, and the potential use of features introduced by the quanvolutional transformations. There has recently been an increase in research into the use of quantum circuits in machine learning applications, such as in Wilson et al. (2018), where classical data is processed through randomly parameterized quantum circuits and the output is then used to train linear models. These models built on the quantum transformations were shown to be beneficial compared with other linear models built directly on the dataset itself, but did not match the performance of other classical models (such as SVMs). The experiments in this work build off these results by integrating quantum feature detection into a more complex neural network architecture, as the QNN framework naturally incorporates classical models that already contain non-linearities. In Section 3.2, we clearly specify the comparisons between quantum and classical approaches.
Classical dataset
The image benchmark MNIST dataset (LeCun et al. 1998) was used in this work, which contains 70,000 (60,000 training and 10,000 test) 28-by-28 greyscale pixel images. While several QML papers used a subset of the full dataset (Wilson et al. 2018) or reduced the dimensionality of the dataset to fit onto hardware (Adachi and Henderson 2015; Benedetti et al. 2017), the spatially local transformational nature of the quanvolutional approach allows this framework to be applied to large, high-dimensional datasets.
Tested models
In this research, we tested three separate models:
1. CNN MODEL. A purely classical convolutional neural network, with the following network structure: CONV1 - POOL1 - CONV2 - POOL2 - FC1 - FC2. Each convolutional layer used ReLU and filters of size 5-by-5, and the first and second convolutional layers had 50 and 64 filters, respectively. The first fully connected layer had 1024 hidden units and a dropout layer with rate 0.4, and the second fully connected layer is the output layer, with 10 hidden units (one for each target variable label). A sketch of this stack is shown after this list.
2. QNN MODEL. The most basic quanvolutional neural network: a CNN with a single quanvolutional layer. Specifically, the single quanvolutional layer is the first transformation in the stack, and the remaining architecture on top is exactly the same as the CNN MODEL: QUANV1 - CONV1 - POOL1 - CONV2 - POOL2 - FC1 - FC2 (Footnote 1). The number of filters in the quanvolutional layer was swept over a range from 1 to 50 to determine the effect of the number of filters on model performance.
3. RANDOM MODEL. A network similar to the QNN MODEL architecture, except that the first transformation is a purely classical random non-linear transformation rather than a quanvolutional one.
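To make the CNN MODEL baseline concrete, the following is a minimal PyTorch sketch of the layer stack described above. The padding choice ("same"-style padding of 2 for the 5-by-5 filters) and the resulting flattened size of 64 × 7 × 7 are our assumptions, since the paper does not state them; the QNN MODEL would simply prepend a quanvolutional layer to this stack.

```python
import torch.nn as nn

# Minimal sketch of the CNN MODEL stack: CONV1 - POOL1 - CONV2 - POOL2 - FC1 - FC2.
# Padding of 2 ("same"-style for 5x5 kernels) is an assumption, chosen so that
# 28x28 MNIST inputs reduce to 7x7 after two 2x2 poolings (64 * 7 * 7 = 3136).
cnn_model = nn.Sequential(
    nn.Conv2d(1, 50, kernel_size=5, padding=2), nn.ReLU(),   # CONV1: 50 filters, 5x5
    nn.MaxPool2d(2),                                          # POOL1
    nn.Conv2d(50, 64, kernel_size=5, padding=2), nn.ReLU(),  # CONV2: 64 filters, 5x5
    nn.MaxPool2d(2),                                          # POOL2
    nn.Flatten(),
    nn.Linear(64 * 7 * 7, 1024), nn.ReLU(),                  # FC1: 1024 hidden units
    nn.Dropout(0.4),                                          # dropout rate of 0.4
    nn.Linear(1024, 10),                                      # FC2: 10 output labels
)
```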
By comparing the QNN MODEL network with both the CNN MODEL and the RANDOM MODEL, we can address whether or not adding quantum features improves the overall CNN model performance in any way, and investigate the QNN performance against a classical non-linear approach. Each model was trained for 10,000 iterations, and every 100 training steps the current log-loss and test set accuracy of the model were saved.
Experimental environment
The experiments performed in this research were conducted on the QxBranch Quantum Computer Simulation System (QCSS). This experimental environment represents a universal quantum computer capable of executing gate-model instructions of arbitrary gate width, circuit depth, and fidelity. No noise models were used in the experiments, in order to assess the effectiveness of the ideal universal quantum computational model versus the classical computational model. Future experimentation could include the effects of noise, or the use of NISQ hardware such as that provided by Google, IBM, and Rigetti Computing.
Quanvolutional filter generation methodology
To generate each quanvolutional filter, we require the input size of the quanvolutional filters, which defines how many qubits are required for the circuit (Footnote 2). In this work, we chose the simplest possible implementation and used only 3-by-3 quanvolutional filters, so that each simulated circuit had exactly 9 qubits (n = 3). We then generated the actual circuit by treating each qubit as a node in a graph and assigning a “connection probability” between each pair of qubits. This probability is the likelihood that a 2-qubit gate will be applied between the two qubits in question, using a randomly selected CNot, Swap, SqrtSwap, or ControlledU gate. Additionally, a random number (in the range [0, 2n²]) of 1-qubit gates was generated (drawn from the gate set [X(θ), Y(θ), Z(θ), U(θ), P, T, H], where θ is a random rotational parameter), with the target qubit for each gate chosen at random. After all 1- and 2-qubit gates were generated, the order of these gate operations in the circuit was shuffled. This final ordering of gate operations became one quanvolutional filter in the quanvolutional layer.
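The following is an illustrative Python sketch of this generation procedure. Gates are recorded as plain tuples rather than executed on a simulator, and the connection probability value of 0.5 is our assumption, since the paper does not state the value used.

```python
import itertools
import math
import random

# Gate sets named in the text; X/Y/Z/U take a random rotational parameter.
ONE_QUBIT_GATES = ["X", "Y", "Z", "U", "P", "T", "H"]
TWO_QUBIT_GATES = ["CNot", "Swap", "SqrtSwap", "ControlledU"]

def random_quanvolutional_circuit(n=3, connection_prob=0.5, seed=None):
    """Generate one random quanvolutional filter as a shuffled list of gate tuples."""
    rng = random.Random(seed)
    num_qubits = n * n
    gates = []
    # 2-qubit gates: each pair of qubits receives a random 2-qubit gate with some probability.
    for q1, q2 in itertools.combinations(range(num_qubits), 2):
        if rng.random() < connection_prob:
            gates.append((rng.choice(TWO_QUBIT_GATES), (q1, q2)))
    # 1-qubit gates: a random number of them, in the range [0, 2 * n**2].
    for _ in range(rng.randint(0, 2 * n * n)):
        gate = rng.choice(ONE_QUBIT_GATES)
        target = (rng.randrange(num_qubits),)
        angle = (rng.uniform(0.0, 2.0 * math.pi),) if gate in ("X", "Y", "Z", "U") else ()
        gates.append((gate, target) + angle)
    rng.shuffle(gates)   # the final gate ordering defines one quanvolutional filter
    return gates

print(random_quanvolutional_circuit(seed=42)[:5])
```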
Encoding and decoding methodology
As mentioned in Section 2.4, the optimal method for interfacing between the classical and quantum components of the QNN algorithm is currently an open question. In our approach, the gate operations in our quantum circuits were kept static and the information was encoded into the initial states of each individual qubit, as shown in Fig. 1b. To experiment with the simplest form of encoding, we applied a threshold value to each pixel: values greater than this threshold (in this case 0) were encoded in the |1〉 state, while those equal to or below it were encoded in the |0〉 state. In terms of decoding, as shown in Fig. 1b, the output is condensed to a scalar value to enforce that the quanvolutional filter functions in a similar way to a convolutional filter. Taking advantage of the QCSS’s ability to output the whole state vector, we selected the most likely output basis state and summed the number of qubits measured in the |1〉 state. The binary threshold encoding drastically reduces the total input state-space, making it possible, as a pre-processing step, to fully determine all possible input-output mappings for any input data. In essence, a look-up table was applied to each new set of spatially local input data, rather than re-running the data through a quantum circuit each time. While this exact approach is infeasible on hardware, since the full state vector can never be read out directly, we used this method to reduce the number of costly classical simulations of quantum circuits required in our experiments. In future work, a more practical but similar approach using a finite number of measurements could be implemented, taking the most commonly measured result.
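Because the threshold encoding maps every 3-by-3 patch to one of only 2⁹ = 512 binary patterns, the full input-output mapping of a quanvolutional filter can be precomputed once and then applied as a lookup table. Below is a minimal sketch of that preprocessing step; `quanv_filter` refers to the illustrative function from the Section 2.3 sketch (itself an assumption, since it substitutes a Haar-random unitary for the gate-level circuit).

```python
import itertools
import numpy as np

def build_lookup_table(filter_fn, n=3):
    """Precompute filter_fn over all 2**(n*n) binary patches reachable via threshold encoding."""
    table = {}
    for bits in itertools.product([0, 1], repeat=n * n):
        patch = np.array(bits, dtype=float).reshape(n, n)
        table[bits] = filter_fn(patch)
    return table

def apply_cached_filter(table, patch, threshold=0.0):
    """Threshold-encode a new patch and look up its precomputed scalar output."""
    key = tuple(int(v > threshold) for v in patch.flatten())
    return table[key]

# Usage, assuming `quanv_filter` from the earlier sketch is in scope:
# table = build_lookup_table(quanv_filter)         # 512 circuit simulations, done once
# feature = apply_cached_filter(table, some_patch)
```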
Results
Before comparing the overall QNN algorithm with classical performance, we first tested that the overall algorithm was performing as expected. We ran several different QNN MODELs with a varying number of quanvolutional filters and analyzed the test set accuracy as a function of training iterations. These results are shown in Fig. 2 and validate two important aspects of the overall QNN algorithm. First, the QNN algorithm functioned as expected within the larger framework; adding the quanvolutional layer to the overall network stack generated the high accuracy results (95% or higher) expected from a deep neural network. Second, the quanvolutional layer and network behaved as expected compared with a similar classical convolutional layer: the more training iterations, the higher the model accuracy became. Additionally, adding more quanvolutional filters increased model performance, consistent with adding more classical filters to a convolutional network. This improvement converges as expected; just as a classical convolutional layer reaches a “saturation” effect after a certain number of filters, a similar convergence was observed for the number of quanvolutional filters. While the network accuracy increased markedly from a single filter to 5, and similarly from 5 to 10, there was minimal advantage in using 50 vs 25 quanvolutional filters in this experiment.
Fig. 2
QNN MODEL test set accuracy results using a variable number of quanvolutional filters
Having validated that the overall quanvolutional layer performed as intended, the final test was to determine how these QNN MODELs compared with both the CNN MODEL and the RANDOM MODEL. In our experiment, both the QNN MODEL and the RANDOM MODEL had 25 transformations in their quanvolutional and random non-linear transformational layers, respectively. The results comparing these three models are shown in Fig. 3.
Fig. 3
QNN MODEL performance, in terms of a test set accuracy and b training log-loss, compared with both CNN MODEL and RANDOM MODEL
The results of Fig. 3 are consistent with those of Wilson et al. (2018), while also extending the scope of applicability of quantum features (features produced by processing classical data through quantum transformations) even further. The work of Wilson et al. (2018) showed that quantum transformations feeding into a linear model could give a performance enhancement over linear models built on the data directly. In a similar way, the performance boost seen in Fig. 3 makes a strong argument that adding quantum features into a more complex non-linear model stack can lead to a similar performance benefit compared with the same non-linear model stack built directly on the data itself. However, while this is a promising step for some form of QNN applicability, it still does not show any clear quantum advantage over all classical models. The results of the RANDOM MODEL were statistically indistinguishable from the QNN MODEL results; this implies that the non-linear transformations of the random quantum circuits were not in any significant way advantageous (or disadvantageous) compared with classical random non-linear transformations.
Conclusions
QNNs could provide early quantum adopters a useful, flexible, and scalable quantum machine learning application for real-world problems. The QNN experimental results showed that even within a larger, typical deep neural network architecture stack, quanvolutional transformations to classical data can increase accuracy in the network; however, this research did not definitively show any quantum advantage over other classical, non-linear transformations.
While outside the scope of this work, these results imply that the next challenge should be to determine what properties of quanvolutional filters, or which specific quanvolutional filters, are both (1) useful for machine learning purposes and (2) classically difficult to simulate. This presents interesting challenges and research questions. Are there structured quanvolutional filters that seem to always provide an advantage over others? How data-dependent is the ideal “set” of quanvolutional filters? How much do encoding and decoding approaches influence overall performance? What are the minimal quanvolutional filter gate depths that lead to some kind of advantage? How does the addition of multiple quanvolutional layers to different parts of a QNN affect performance? Is there a similar pattern to CNNs, where using larger quanvolutional filters lower in the stack and smaller filters higher in the stack leads to better results? Attempting to answer these (and similar) questions will be a necessary step on any path towards a viable use case for QNNs.
Notes
1. The QNN topology chosen in this work is not fixed by nature. As mentioned in Section 2.2, the QNN framework was designed to give users complete control over the number and order of quanvolutional layers in the architecture. The topology explored in this work was chosen because it was the simplest QNN architecture to use as a baseline for comparison against other purely classical networks. Future work would focus on exploring the impact of more complex architectural variations.
2. It is not a hard requirement in general that the number of qubits in the quanvolutional filter matches the dimensionality of the input data. In many quantum circuit applications, ancilla qubits are required to perform elements of an overall computation. Such transformations using ancilla qubits, however, are outside the scope of this work and will be an interesting future topic to explore.
References
- Aaronson S (2015) Read the fine print. Nature Physics 11:291. https://www.nature.com/articles/nphys3272
- Adachi SH, Henderson MP (2015) arXiv:1510.06356
- Benedetti M, Realpe-Gómez J, Biswas R, Perdomo-Ortiz A (2017) Quantum-assisted learning of hardware-embedded probabilistic graphical models. Phys Rev X 7(4):41052. https://doi.org/10.1103/PhysRevX.7.041052
- Bergholm V, Izaac J, Schuld M, Gogolin C, Blank C, McKiernan K, Killoran N (2018) arXiv:1811.04968
- Bernstein E, Vazirani U (1997) Quantum complexity theory. SIAM J Comput 26(5):1411. https://doi.org/10.1137/S0097539796300921
- Biamonte J, Wittek P, Pancotti N, Rebentrost P, Wiebe N, Lloyd S (2018) Quantum machine learning. Nature 549(7671):195. https://www.nature.com/articles/nature23474
- Boixo S, Isakov SV, Smelyanskiy VN, Babbush R, Ding N, Jiang Z, Bremner MJ, Martinis JM, Neven H (2016) https://doi.org/10.1038/s41567-018-0124-x. arXiv:1608.00263
- Ciliberto C, Herbster M, Ialongo AD, Pontil M, Rocchetto A, Severini S, Wossnig L (2018) Quantum machine learning: a classical perspective. Proc R Soc A 474(2209):20170551. https://royalsocietypublishing.org/doi/10.1098/rspa.2017.0551
- Crooks GE (2018) QuantumFlow: a quantum algorithms development toolkit. https://quantumflow.readthedocs.io/en/latest/
- Dunjko V, Briegel HJ (2018) Machine learning & artificial intelligence in the quantum domain: a review of recent progress. Reports on Progress in Physics 81(7):074001. https://doi.org/10.1088/1361-6633/aab406
- Dunjko V, Taylor JM, Briegel HJ (2016) Quantum-enhanced machine learning. Phys Rev Lett 117 (13):130501. https://doi.org/10.1103/PhysRevLett.117.130501
- Fujii K, Nakajima K (2016) https://doi.org/10.1103/physrevapplied.8.024030 arXiv:1602.08159
- Havlíček V, Córcoles AD, Temme K, Harrow AW, Kandala A, Chow JM, Gambetta JM (2019) Supervised learning with quantum-enhanced feature spaces. Nature 567(7747):209. https://doi.org/10.1038/s41586-019-0980-2
- Huang C, Newman M, Szegedy M (2018) arXiv:1804.10368
- Jaeger H, Haas H (2004) Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication. Science 304(5667):78. https://science.sciencemag.org/content/304/5667/78.abstract
- Jordan S (2011) Quantum algorithm zoo. https://math.nist.gov/quantum/zoo/
- Killoran N, Bromley TR, Arrazola JM, Schuld M, Quesada N, Lloyd S (2018) arXiv:1806.06871
- LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11):2278. https://doi.org/10.1109/5.726791. https://ieeexplore.ieee.org/document/726791
- Mitarai K, Negoro M, Kitagawa M, Fujii K (2018) https://doi.org/10.1103/PhysRevA.98.032309. arXiv:1803.00745
- Perdomo-Ortiz A, Benedetti M, Realpe-Gómez J, Biswas R (2017) Opportunities and challenges for quantum-assisted machine learning in near-term quantum computers. Quantum Science and Technology 3(3):030502. arXiv:1708.09757v2. https://iopscience.iop.org/article/10.1088/2058-9565/aab859/meta
- Preskill J (2018) https://doi.org/10.22331/q-2018-08-06-79. arXiv:1801.00862
- Ranzato M, Huang FJ, Boureau Y, LeCun Y (2007) Unsupervised learning of invariant feature hierarchies with applications to object recognition. In: 2007 IEEE conference on computer vision and pattern recognition, pp 1–8. https://ieeexplore.ieee.org/document/4270182. https://doi.org/10.1109/CVPR.2007.383157
- Sabour S, Frosst N, Hinton GE (2017) arXiv:1710.09829
- Schuld M, Bergholm V, Gogolin C, Izaac J, Killoran N (2018) https://doi.org/10.1103/physreva.99.032331. arXiv:1811.11184
- Wilson CM, Otterbach JS, Tezak N, Smith RS, Crooks GE, da Silva MP (2018) arXiv:1806.08321