# Voluptuous is a Python data validation library

[![Build Status](https://travis-ci.org/alecthomas/voluptuous.png)](https://travis-ci.org/alecthomas/voluptuous)
[![Coverage Status](https://coveralls.io/repos/github/alecthomas/voluptuous/badge.svg?branch=master)](https://coveralls.io/github/alecthomas/voluptuous?branch=master) [![Gitter chat](https://badges.gitter.im/alecthomas.png)](https://gitter.im/alecthomas/Lobby)

Voluptuous, *despite* the name, is a Python data validation library. It
is primarily intended for validating data coming into Python as JSON,
YAML, etc.

It has three goals:

1. Simplicity.
2. Support for complex data structures.
3. Useful error messages.

## Contact

Voluptuous now has a mailing list! Send a mail to
[<voluptuous@librelist.com>](mailto:voluptuous@librelist.com) to subscribe. Instructions
will follow.

You can also contact me directly via [email](mailto:alec@swapoff.org) or
[Twitter](https://twitter.com/alecthomas).

To file a bug, create a [new issue](https://github.com/alecthomas/voluptuous/issues/new) on GitHub with a short example of how to replicate the issue.

## Documentation

The documentation is provided [here](http://alecthomas.github.io/voluptuous/).

## Changelog

See [CHANGELOG.md](CHANGELOG.md).

## Show me an example

Twitter's [user search API](https://dev.twitter.com/rest/reference/get/users/search) accepts
query URLs like:

```
$ curl 'http://api.twitter.com/1.1/users/search.json?q=python&per_page=20&page=1'
```

To validate this we might use a schema like:

```pycon
>>> from voluptuous import Schema
>>> schema = Schema({
...     'q': str,
...     'per_page': int,
...     'page': int,
... })
```

This schema succinctly, if roughly, describes the data required by the
API, and will work fine. However, it doesn't fully express the
constraints of the API. For example, according to the API, `per_page`
should be restricted to at most 20, defaulting to 5. To describe the
semantics of the API more accurately, our schema will need to be more
thoroughly defined:

```pycon
>>> from voluptuous import Required, All, Length, Range
>>> schema = Schema({
...     Required('q'): All(str, Length(min=1)),
...     Required('per_page', default=5): All(int, Range(min=1, max=20)),
...     'page': All(int, Range(min=0)),
... })
```

This schema fully enforces the interface defined in Twitter's
documentation, and goes a little further for completeness.

"q" is required:

```pycon
>>> from voluptuous import MultipleInvalid, Invalid
>>> try:
...     schema({})
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "required key not provided @ data['q']"
True
```

...must be a string:

```pycon
>>> try:
...     schema({'q': 123})
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "expected str for dictionary value @ data['q']"
True
```

...and must be at least one character in length:

```pycon
>>> try:
...     schema({'q': ''})
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "length of value must be at least 1 for dictionary value @ data['q']"
True
>>> schema({'q': '#topic'}) == {'q': '#topic', 'per_page': 5}
True
```

"per\_page" is a positive integer no greater than 20:

```pycon
>>> try:
...     schema({'q': '#topic', 'per_page': 900})
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "value must be at most 20 for dictionary value @ data['per_page']"
True
>>> try:
...     schema({'q': '#topic', 'per_page': -10})
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "value must be at least 1 for dictionary value @ data['per_page']"
True
```

"page" is an integer \>= 0:

```pycon
>>> try:
...     schema({'q': '#topic', 'page': 'one'})
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc)
"expected int for dictionary value @ data['page']"
>>> schema({'q': '#topic', 'page': 1}) == {'q': '#topic', 'page': 1, 'per_page': 5}
True
```

## Defining schemas

Schemas are nested data structures consisting of dictionaries, lists,
scalars and *validators*. Each node in the input schema is pattern
matched against corresponding nodes in the input data.

### Literals

Literals in the schema are matched using normal equality checks:

```pycon
>>> schema = Schema(1)
>>> schema(1)
1
>>> schema = Schema('a string')
>>> schema('a string')
'a string'
```

### Types

Types in the schema are matched by checking if the corresponding value
is an instance of the type:

```pycon
>>> schema = Schema(int)
>>> schema(1)
1
>>> try:
...     schema('one')
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "expected int"
True
```

### URLs

URLs in the schema are matched using the `urlparse` library:

```pycon
>>> from voluptuous import Url
>>> schema = Schema(Url())
>>> schema('http://w3.org')
'http://w3.org'
>>> try:
...     schema('one')
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "expected a URL"
True
```

### Lists

Lists in the schema are treated as a set of valid values. Each element
in the schema list is compared to each value in the input data:

```pycon
>>> schema = Schema([1, 'a', 'string'])
>>> schema([1])
[1]
>>> schema([1, 1, 1])
[1, 1, 1]
>>> schema(['a', 1, 'string', 1, 'string'])
['a', 1, 'string', 1, 'string']
```

However, an empty list (`[]`) is treated as is: it matches only an empty
list. If you want to specify a list that can contain anything, specify it
as `list`:

```pycon
>>> schema = Schema([])
>>> try:
...     schema([1])
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "not a valid value"
True
>>> schema([])
[]
>>> schema = Schema(list)
>>> schema([])
[]
>>> schema([1, 2])
[1, 2]
```

### Validation functions

Validators are simple callables that raise an `Invalid` exception when
they encounter invalid data. The criteria for determining validity are
entirely up to the implementation; it may check that a value is a valid
username with `pwd.getpwnam()`, it may check that a value is of a
specific type, and so on.

The simplest kind of validator is a Python function that raises
`ValueError` when its argument is invalid. Conveniently, many builtin
Python functions have this property. Here's an example of a date
validator:

```pycon
>>> from datetime import datetime
>>> def Date(fmt='%Y-%m-%d'):
...     return lambda v: datetime.strptime(v, fmt)
```

```pycon
>>> schema = Schema(Date())
>>> schema('2013-03-03')
datetime.datetime(2013, 3, 3, 0, 0)
>>> try:
...     schema('2013-03')
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "not a valid value"
True
```

In addition to simply determining if a value is valid, validators may
mutate the value into a valid form. An example of this is the
`Coerce(type)` function, which returns a function that coerces its
argument to the given type:

```python
from voluptuous import Invalid


def Coerce(type, msg=None):
    """Coerce a value to a type.

    If the type constructor throws a ValueError, the value will be marked as
    Invalid.
    """
    def f(v):
        try:
            return type(v)
        except ValueError:
            raise Invalid(msg or ('expected %s' % type.__name__))
    return f
```

This example also shows a common idiom where an optional human-readable
message can be provided. This can vastly improve the usefulness of the
resulting error messages.

### Dictionaries

Each key-value pair in a schema dictionary is validated against each
key-value pair in the corresponding data dictionary:

```pycon
>>> schema = Schema({1: 'one', 2: 'two'})
>>> schema({1: 'one'})
{1: 'one'}
```

#### Extra dictionary keys

By default, any additional keys in the data that are not in the schema
will trigger exceptions:

```pycon
>>> schema = Schema({2: 3})
>>> try:
...     schema({1: 2, 2: 3})
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "extra keys not allowed @ data[1]"
True
```

This behaviour can be altered on a per-schema basis. To allow
additional keys use
`Schema(..., extra=ALLOW_EXTRA)`:

```pycon
>>> from voluptuous import ALLOW_EXTRA
>>> schema = Schema({2: 3}, extra=ALLOW_EXTRA)
>>> schema({1: 2, 2: 3})
{1: 2, 2: 3}
```

To remove additional keys use
`Schema(..., extra=REMOVE_EXTRA)`:

```pycon
>>> from voluptuous import REMOVE_EXTRA
>>> schema = Schema({2: 3}, extra=REMOVE_EXTRA)
>>> schema({1: 2, 2: 3})
{2: 3}
```

It can also be overridden per-dictionary by using the catch-all marker
token `Extra` as a key:

```pycon
>>> from voluptuous import Extra
>>> schema = Schema({1: {Extra: object}})
>>> schema({1: {'foo': 'bar'}})
{1: {'foo': 'bar'}}
```

However, an empty dict (`{}`) is treated as is: it matches only an empty
dict. If you want to specify a dict that can contain anything, specify it
as `dict`:

```pycon
>>> schema = Schema({}, extra=ALLOW_EXTRA)  # don't do this
>>> try:
...     schema({'extra': 1})
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "not a valid value"
True
>>> schema({})
{}
>>> schema = Schema(dict)  # do this instead
>>> schema({})
{}
>>> schema({'extra': 1})
{'extra': 1}
```

#### Required dictionary keys

By default, keys in the schema are not required to be in the data:

```pycon
>>> schema = Schema({1: 2, 3: 4})
>>> schema({3: 4})
{3: 4}
```

Similarly to how extra keys work, this behaviour can be overridden
per-schema:

```pycon
>>> schema = Schema({1: 2, 3: 4}, required=True)
>>> try:
...     schema({3: 4})
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "required key not provided @ data[1]"
True
```

And per-key, with the marker token `Required(key)`:

```pycon
>>> schema = Schema({Required(1): 2, 3: 4})
>>> try:
...     schema({3: 4})
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "required key not provided @ data[1]"
True
>>> schema({1: 2})
{1: 2}
```

#### Optional dictionary keys

If a schema has `required=True`, keys may be individually marked as
optional using the marker token `Optional(key)`:

```pycon
>>> from voluptuous import Optional
>>> schema = Schema({1: 2, Optional(3): 4}, required=True)
>>> try:
...     schema({})
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "required key not provided @ data[1]"
True
>>> schema({1: 2})
{1: 2}
>>> try:
...     schema({1: 2, 4: 5})
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "extra keys not allowed @ data[4]"
True
```

```pycon
>>> schema({1: 2, 3: 4})
{1: 2, 3: 4}
```

### Recursive schema

There is no syntax for defining a recursive schema. The best way to do it
is to use a wrapper function like this:

```pycon
>>> from voluptuous import Schema, Any
>>> def s2(v):
...     return s1(v)
...
>>> s1 = Schema({"key": Any(s2, "value")})
>>> s1({"key": {"key": "value"}})
{'key': {'key': 'value'}}
```

### Extending an existing Schema

It is often handy to have a base `Schema` that is extended with more
requirements. In that case you can use `Schema.extend` to create a new
`Schema`:

```pycon
>>> from voluptuous import Schema
>>> person = Schema({'name': str})
>>> person_with_age = person.extend({'age': int})
>>> sorted(list(person_with_age.schema.keys()))
['age', 'name']
```

The original `Schema` remains unchanged.

### Objects

Each key-value pair in a schema dictionary is validated against each
attribute-value pair in the corresponding object:

```pycon
>>> from voluptuous import Object
>>> class Structure(object):
...     def __init__(self, q=None):
...         self.q = q
...     def __repr__(self):
...         return '<Structure(q={0.q!r})>'.format(self)
...
>>> schema = Schema(Object({'q': 'one'}, cls=Structure))
>>> schema(Structure(q='one'))
<Structure(q='one')>
```

### Allow None values

To allow a value to be `None` as well, use `Any`:

```pycon
>>> from voluptuous import Any
>>> schema = Schema(Any(None, int))
>>> schema(None)
>>> schema(5)
5
```

## Error reporting

Validators must throw an `Invalid` exception if invalid data is passed
to them. Apart from `ValueError`, which is reported as invalid data as
described above, all other exceptions are treated as errors in the
validator and will not be caught.

Each `Invalid` exception has an associated `path` attribute representing
the path in the data structure to the value currently being validated, as
well as an `error_message` attribute that contains the message of the
original exception. This is especially useful when you want to catch
`Invalid` exceptions and give some feedback to the user, for instance in
the context of an HTTP API.

```pycon
>>> def validate_email(email):
...     """Validate email."""
...     if "@" not in email:
...         raise Invalid("This email is invalid.")
...     return email
>>> schema = Schema({"email": validate_email})
>>> exc = None
>>> try:
...     schema({"email": "whatever"})
... except MultipleInvalid as e:
...     exc = e
>>> str(exc)
"This email is invalid. for dictionary value @ data['email']"
>>> exc.path
['email']
>>> exc.msg
'This email is invalid.'
>>> exc.error_message
'This email is invalid.'
```

The `path` attribute is used during error reporting, but also during matching
to determine whether an error should be reported to the user or if the next
match should be attempted. This is determined by comparing the depth of the
path where the check is, to the depth of the path where the error occurred. If
the error is more than one level deeper, it is reported.

The upshot of this is that *matching is depth-first and fail-fast*.

To illustrate this, here is an example schema:

```pycon
>>> schema = Schema([[2, 3], 6])
```

Each value in the top-level list is matched depth-first in-order. Given
input data of `[[6]]`, the inner list will match the first element of
the schema, but the literal `6` will not match any of the elements of
that list. This error will be reported back to the user immediately. No
backtracking is attempted:

```pycon
>>> try:
...     schema([[6]])
...     raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...     exc = e
>>> str(exc) == "not a valid value @ data[0][0]"
True
```

If we pass the data `[6]`, the `6` is not a list type and so will not
recurse into the first element of the schema. Matching will continue on
to the second element in the schema, and succeed:

```pycon
>>> schema([6])
[6]
```

## Running tests

Voluptuous uses nosetests:

    $ nosetests

## Why use Voluptuous over another validation library?

**Validators are simple callables.**
: No need to subclass anything, just use a function.

**Errors are simple exceptions.**
: A validator can just `raise Invalid(msg)` and expect the user to get
  useful messages.

**Schemas are basic Python data structures.**
: Should your data be a dictionary of integer keys to strings?
  `{int: str}` does what you expect. List of integers, floats or
  strings? `[int, float, str]`.

**Designed from the ground up for validating more than just forms.**
: Nested data structures are treated in the same way as any other
  type. Need a list of dictionaries? `[{}]`

**Consistency.**
: Types in the schema are checked as types. Values are compared as
  values. Callables are called to validate. Simple.

## Other libraries and inspirations

Voluptuous is heavily inspired by
[Validino](http://code.google.com/p/validino/), and to a lesser extent,
[jsonvalidator](http://code.google.com/p/jsonvalidator/) and
[json\_schema](http://blog.sendapatch.se/category/json_schema.html).

I greatly prefer the light-weight style promoted by these libraries to
the complexity of libraries like FormEncode.