Search Guide
Our search engine tries to offer the typical web searching experience familiar from popular search engines such as Google. The nature of bibliographic searching differs from that of web page searching, though. We provide many extensions to enable complex and precise structured searches, including a combined metadata, fulltext and reference search in one go. This page lists several tips and tricks that you may find useful to this effect.
The default search mode is simple search, which provides you with one input box where you can type your query, followed by a choice of common indexes to search within. You would usually simply type the keywords you are interested in and hit return. For example, if you are interested in documents on the standard model that are written by (or mention) Ellis, you would type:
and on the search results page you could further add/remove keywords to get more precisely at what you are looking for, as is mentioned below.
The advanced search interface provides you with explicit tools to play with: you can change the matching type from the default word matching to phrase searching or regular expression matching; you can use boolean queries in several indexes, etc. For example, to find all the documents written by Ellis, J spelled exactly that way that contain either of the words muon or neutrino in the title and that were published in 2001, you would type:
Note that Simple Search can provide you with basically the same functionality if you make use of the special syntax that is explained in the text below. The simple-versus-advanced distinction refers not to the functionality provided but to the number of parameters you can "tweak". We conform to the common use of the simple/advanced terms as found in other search engines.
Much of what follows will deal with a question on "how a power user would use the simple search interface". Recall that you can always go to the Advanced Search for more query assistance.
After you submit your query, the search engine will analyze it and will try to always guide you in case no exact match could be found. For example, it would print you a list of closest indexed terms in case of spelling troubles:
Alternative choices will be printed in red. The search engine will similarly warn you when your search terms could not be found, or when they could but your boolean query couldn't be met. The search engine will also silently try to search for alternative forms (e.g. remove punctuation), etc.
Thanks to multiple search stages and the guidance provided at each stage, it is usually sufficient to simply type what you are looking for and see what the system says in return. If you aren't satisfied, you can then add/remove words from your query until you get a satisfactory reply.
The default search mode is a search for words. This means that any whitespace you type is not significant, but is rather interpreted to mean "add an automatic boolean AND between words", like Google does. For example, to find all records that contain both the word ellis and the word muon anywhere in the record, type:
Whitespace is significant if you include it within quotes. There are two phrase searching modes: exact phrase search, introduced by double quotes, and partial phrase search, introduced by single quotes. The difference between the exact and partial phrase searching modes may not be obvious at first look. While the latter is closer to what "phrase search" usually means in the context of web page search engines, the former is usually an order of magnitude faster if you know the precise values you are looking for.
(Note: For some indexes such as any field, title, or abstract, there is no distinction between searching for double quoted and single quoted expressions. Both behave the same usual way.)
Another interesting searching mode besides the word and phrase searches is the regular expression search, introduced by slashes instead of quotes. For example, the above partial phrase query 'muon decay' is fully equivalent to the regular expression query /muon decay/. The regular expression syntax is very powerful and permits you to construct very complex queries. For more information, please consult the regular expression section of this guide.
+ (AND)
    ellis +muon       matches all records that contain both the word ellis and the word muon
    ellis muon        ditto, syntactic sugar
    ellis and muon    ditto, syntactic sugar
- (NOT)
    ellis -muon       matches all records that contain the word ellis but that do not contain the word muon
    ellis not muon    ditto, syntactic sugar
| (OR)
    ellis |muon       matches all records that contain at least one of the words ellis or muon
    ellis or muon     ditto, syntactic sugar
Logical operations are automatically chained from left to right. For example, if you want to search for documents written by Ellis on muons or kaons, write:
    muon or kaon and ellis
which looks for (muon or kaon) and ellis. Note that this gives different results from:
    ellis and muon or kaon
which would search for (ellis and muon) or kaon.
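The left-to-right chaining can be illustrated with a small sketch. This is not the search engine's actual code: the function name, the toy index and the record ids are all made up for illustration, and only the word operators and, or, not are handled (not the +, -, | forms or parentheses).

```python
def evaluate_left_to_right(query, index):
    """Evaluate a flat boolean query strictly from left to right.

    `index` is a hypothetical mapping from a word to the set of record ids
    that contain it. Whitespace between two words counts as AND.
    """
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    operator = "and"  # default operator when none is typed
    for token in tokens[1:]:
        if token in ("and", "or", "not"):
            operator = token
            continue
        matches = index.get(token, set())
        if operator == "and":
            result &= matches
        elif operator == "or":
            result |= matches
        else:  # "not"
            result -= matches
        operator = "and"  # fall back to the default for the next word
    return result


# Toy index: word -> record ids (invented data).
index = {"ellis": {1, 2}, "muon": {1, 3}, "kaon": {2, 4}}

print(evaluate_left_to_right("muon or kaon and ellis", index))  # (muon or kaon) and ellis
print(evaluate_left_to_right("ellis and muon or kaon", index))  # (ellis and muon) or kaon
```

With this toy data the two queries return different record sets, which is the point made above: the order of the terms matters because no operator takes precedence over another.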
The left-to-right chaining behaviour permits you to easily refine your search by adding/removing words with and/not or +/- operators. For example, to exclude the documents on decay from the above search, append -decay:
This query returns records containing either gravity or supergravity, and either ellis or perelstein anywhere in the record.
Note that you can use any number of parentheses in the query. Nested parentheses, such as foo AND (bar OR (fuux NOT quux)), are also supported.
When indexing words, care is taken to index them both with and without punctuation, so that you should be able to search for terms containing special characters, such as C++, verbatim:
For example, to find records containing the LaTeX expression $e^{+}e^{-}$ in the title, type:
For example, to find the document with the report number hep-ph/0204133, type:
Note that the search is case-insensitive:
The search engine works with Unicode UTF-8 so you can type your query strings in any language stored in the database. For example, to find the documents written by (or on) Пушкин, type:
Note that you don't have to type accents to find accented results. For example, type Lemaitre to find papers by Lemaître:
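One common way to obtain this kind of case- and accent-insensitive matching is to fold both the indexed terms and the query to a plain form before comparing them. The sketch below shows the idea with Python's standard library; it is an illustration only, and the guide does not say that the search engine is implemented this way.

```python
import unicodedata


def fold(text):
    """Return a case-folded copy of `text` with combining accents removed."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.casefold()


# The query and the stored name compare equal once both are folded.
print(fold("Lemaitre") == fold("Lemaître"))
# Non-Latin scripts pass through; only case is folded here.
print(fold("Пушкин"))
```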
Word truncation is supported via the asterisk (*) wildcard character. The wildcard instructs the search engine to match any number of characters in that place. For example, to find records that contain the words muon, muons, muonic etc, type:
The wildcard query works both in prefix and infix position. For example, to get all the words that start with CERN-TH and end with 31, type:
Note that the wildcard will be ignored if you try to apply it to very short words, such as a*:
The wildcard character can also be used in the phrase searching mode. For example, to find all the documents whose title starts with "Neutrino mass", type:
Recall that we have introduced exact and partial phrase search modes. Actually, a partial phrase search launches an exact search enclosed within wildcards: we could say that 'foo bar baz' is equivalent to "*foo bar baz*". Now you can see why the partial phrase search is slow: because of the two asterisks before and after the text, each and every title in the database has to be looked up to determine whether it matches or not. (There are currently no partial phrase indexes.)
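The wildcard semantics, and the way a partial phrase search reduces to an exact search wrapped in two wildcards, can be sketched as follows. The function names and the sample values are made up for illustration, and the rule that ignores wildcards on very short words is not modelled.

```python
import re


def wildcard_match(pattern, value):
    """Match `value` against `pattern`, where * stands for any characters.

    Everything except * is taken literally; matching is case-insensitive
    and must cover the whole value.
    """
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    regex = re.compile(".*".join(literal_parts), re.IGNORECASE | re.DOTALL)
    return regex.fullmatch(value) is not None


def partial_phrase_match(phrase, value):
    """A partial phrase search is an exact search enclosed within wildcards."""
    return wildcard_match("*" + phrase + "*", value)


print(wildcard_match("muon*", "muonic"))
print(wildcard_match("CERN-TH*31", "CERN-TH-2001-031"))
print(partial_phrase_match("muon decay", "Rare muon decay modes"))
```

Because the leading wildcard can match any prefix, a real index cannot jump straight to a candidate term and every value has to be checked, which is why the partial phrase search is the slower of the two.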
Searching within various bibliographic fields (such as title, author) is supported via a syntax similar to Google's "site:". If a search term is preceded by a field name and a colon, then the term is searched for inside this field only. For example, to find documents containing the word ellis within the author index, type:
Field names include author, title, reportnumber, abstract, keyword, year, experiment, fulltext, and reference.
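To make the field-prefix rule concrete, here is a minimal sketch of how a single search term could be split into a field name and a value. The function name and the "anyfield" label for terms without a prefix are assumptions for illustration; the set of field names is the one listed above.

```python
KNOWN_FIELDS = {
    "author", "title", "reportnumber", "abstract", "keyword",
    "year", "experiment", "fulltext", "reference",
}


def split_field(term):
    """Split 'field:value' into (field, value).

    Terms without a recognised field prefix are returned unchanged under
    the hypothetical label 'anyfield'.
    """
    field, separator, value = term.partition(":")
    if separator and value and field.lower() in KNOWN_FIELDS:
        return field.lower(), value
    return "anyfield", term


print(split_field("author:ellis"))
print(split_field("muon"))
```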
The regular expression searching mode is mostly for the power users acquainted with the traditional Unix/POSIX regexp syntax. In the Simple Search interface you can trigger it by using slashes instead of quotes:
while in the Advanced Search interface you can select the matching type explicitly by using the selection box menu. The above example will find all the titles that start with the letter E, followed by any number of any characters, and end with the letter s. Another example could be an author search for an author expressed in the database as either Ellis, J or Ellis, John:
The regular expression search enables you to formulate very specific word proximity queries. For example, let us find all titles containing words dense and matter that are separated by at most one word that doesn't contain the letter l:
Note that you can also use character intervals such as [a-k] and occurrence counts such as {3}. For example, let us find all preprints that do not follow the year cataloguing policy, that is YYYY to denote the year, optionally followed by ? or by another -YYYY:
You can also use POSIX character classes such as [:alnum:], so that the above query is equivalent to:
To learn more about POSIX regular expressions, please consult the Wikipedia regexp article and the MySQL regexp documentation.
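If you want to experiment with such patterns before typing them into the search box, a short script can help. The sketch below uses Python's re module, which is not a POSIX engine (for instance, it does not understand [:alnum:]), but the simple constructs used here behave the same way. The patterns are our own reading of the examples described in this section, not necessarily the exact queries the search engine expects, and the sample titles are invented.

```python
import re

# Titles that start with E and end with s.
starts_with_e_ends_with_s = re.compile(r"^E.*s$")

# An author stored as either "Ellis, J" or "Ellis, John".
ellis_j_or_john = re.compile(r"^Ellis, J(ohn)?$")

# "dense" and "matter" separated by at most one word without the letter l.
dense_near_matter = re.compile(r"dense ([^ l]* )?matter", re.IGNORECASE)

# The year policy: YYYY, optionally followed by ? or by -YYYY.
year_policy = re.compile(r"^[0-9]{4}(\?|-[0-9]{4})?$")

for title in ["Dense quark matter", "Dense cold matter", "Dense matter"]:
    print(title, bool(dense_near_matter.search(title)))

for year in ["2001", "2001?", "1999-2000", "01"]:
    print(year, bool(year_policy.match(year)))
```

Values for which year_policy does not match are the ones that violate the cataloguing policy.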
The span query is provided via a -> sign. For example, to search for all documents on muon decay published between 1983 and 1992, type:
All the syntax mentioned above can be combined together in one query. For example, to find documents that have the word ellis inside author fields, that do not contain words like muon, 'muonic' etc in any field, that contain the phrase (or the substring, to be more precise) 'dense quark matter' inside abstract fields, and that were published in a year starting with the digits '200', type:
Note that the default "any field" global index contains only the metadata terms, not the citation or fulltext terms. You have to explicitly mention the fulltext or reference index to search there. For example, to find the term Higgs in either metadata, references or fulltext files, type:
This permits an interesting combination of metadata, fulltext and citation search in the same query. For example, to get all documents written by Lin whose fulltext files contain the words Schwarzschild and AdS, and which cite the journal Adv. Theor. Math. Phys., type:
black hole than for "black hole".
and, of, or CERN.
You can search for an author in many ways, each having its own advantages and disadvantages.
If you search for Ellis J within the author index, two queries (for the words Ellis and J) are run first and a boolean AND is performed next:
Such a query would also match a document whose first author is Ellis, R and whose second author is Finch, A J, which is probably not what you wanted. While the search is very fast and you would have found the results for the author you were looking for, this technique could return many false positives, such as the one cited above. Instead of searching for words, a more suitable technique in this case is to search for phrases, which permits you to achieve higher search precision.
This way of searching gives you the highest precision and no false positives. (Assuming there are no other authors whose names are spelled Ellis, J, an assumption that is often false *.) The search is very fast.
This way of searching still keeps the highest precision and no false positives. (Assuming there are no other authors whose names are spelled Ellis, J or Ellis, John, an assumption that is often false*.) The search is fast.
It would match all author names that start with the text Ellis, J, i.e. not only the wanted forms Ellis, J and Ellis, John, but also Ellis, Jim, or Ellis, John Rolfe, or Ellis, Jonathan Richard.
This way of searching returns you more results, which may be suitable in case you don't know how the names are spelled in the database. But you also risk the eventuality of getting false positives. The search is relatively fast.
It would find not only all the authors mentioned above, but also the ones whose names contain the expression Ellis, J anywhere inside the name, such as De Lellis, Jim. It thus gives you the largest possible number of hits at the largest risk of false positives. The search is relatively slow.
(Note though that this way of searching may be very handy in case of compound family names such as Pepe-Altarelli, M or 't Hooft, G where a casual user query for Hooft, G would match the wanted author, unlike the methods mentioned above.)
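The trade-offs between these ways of searching for an author can be tried out on a toy list of names. The sketch below is an illustration only: the function names are made up, matching is done in memory rather than against an index, and the names are the ones used as examples in this section.

```python
import re


def word_match(document_authors, query):
    """Word search: every query word must occur somewhere among the authors."""
    indexed_words = {
        word.lower()
        for name in document_authors
        for word in re.findall(r"\w+", name)
    }
    return all(word.lower() in indexed_words for word in query.split())


def exact_match(name, query):
    """Exact phrase search: the whole name must equal the query."""
    return name.lower() == query.lower()


def prefix_match(name, query):
    """Exact phrase with a trailing wildcard: the name must start with the query."""
    return name.lower().startswith(query.lower())


def partial_match(name, query):
    """Partial phrase search: the query may occur anywhere inside the name."""
    return query.lower() in name.lower()


names = ["Ellis, J", "Ellis, John", "Ellis, Jim", "Ellis, John Rolfe",
         "Ellis, R", "De Lellis, Jim", "'t Hooft, G"]

# A false positive of the word search: neither author is "Ellis, J".
print(word_match(["Ellis, R", "Finch, A J"], "Ellis J"))

print([n for n in names if exact_match(n, "Ellis, J")])
print([n for n in names if prefix_match(n, "Ellis, J")])
print([n for n in names if partial_match(n, "Ellis, J")])
print([n for n in names if partial_match(n, "Hooft, G")])
```

Each step down the list returns a superset of the previous one, which mirrors the precision-versus-recall trade-off described above.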
*NOTE: If you produce your own list of publications and you notice that sometimes your first name is spelled abbreviated and sometimes in full, or if you want to identify your publications among several authors with the same abbreviation, please contact the administrators of
You may select a field by which to sort the search results, for example to sort the results by main title. However, sometimes you may want to sort by report number, and it may happen that your documents have several of them. For example, the report numbers hep-ph/0204140, CERN-TH-2002-069 and RM3-TH-02-4 all denote the same document. Now if you sort a search results set containing this document, the system will take into consideration the first report number, which may be any of these three. Sometimes you may want to classify this document under its hep-ph number, sometimes under its CERN number, depending on whether you produce a list of CERN or hep-ph publications. How can you influence the search engine to prefer one report number rather than the other?
In other words, the search engine by default answers a query like "sort by first author" or "sort by first report number", but sometimes you may want to ask the search engine to "sort by first report number that starts with the text CERN-". The latter possibility is available via a "silent" sort parameter called sp (for "sort pattern") that sorts preferentially according to the given textual pattern if it can be found. The parameter is "silent" in the sense that it is not present in the search interface; you have to add it manually to your search URL.
For example, to get all CERN-TH publications of the year 2001 sorted by their CERN-TH numbers, you would search for CERN-TH-2001* within the reportnumber index, and on the search results page, being satisfied with the results, you would add &sp=CERN-TH to the URL to sort the results preferentially by CERN-TH report numbers, to get a nicely sorted list of all CERN-TH 2001 publications.
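The two halves of this recipe, adding sp to the URL and the resulting preference for matching report numbers, can be sketched as follows. Only the sp parameter name comes from this guide; the host name, the other URL parameter, the function names and the second record's report numbers are hypothetical, and the real sorting happens on the server.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_sort_pattern(url, pattern):
    """Return `url` with the sp parameter set to `pattern`."""
    parts = urlsplit(url)
    params = [(key, value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)
              if key != "sp"]
    params.append(("sp", pattern))
    return urlunsplit(parts._replace(query=urlencode(params)))


def preferred_report_number(report_numbers, pattern):
    """Pick the first report number starting with `pattern`, else the first one."""
    for number in report_numbers:
        if number.startswith(pattern):
            return number
    return report_numbers[0] if report_numbers else ""


# Hypothetical search URL; only "sp" is taken from the guide.
print(add_sort_pattern("https://example.org/search?p=muon", "CERN-TH"))

records = [
    ["hep-ph/0204140", "CERN-TH-2002-069", "RM3-TH-02-4"],
    ["hep-ph/0201001", "CERN-TH-2002-001"],  # invented report numbers
]
records.sort(key=lambda numbers: preferred_report_number(numbers, "CERN-TH"))
print([preferred_report_number(r, "CERN-TH") for r in records])
```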
On the search results page, links to other servers such as Google, SPIRES or KEK are automatically proposed in a box entitled "Try your search on". You can simply click on the proposed links to run your query on these search engines.
Note that the links aren't printed if the search engine doesn't support it. For example, SPIRES or KEK cannot search for terms within "any field", so we don't link to them in these cases.
If a metadata record contains some associated fulltext files, their content can be searched via the fulltext index. To search for all records that contain the term e- in their fulltext files, type: