
Discussions on Publishing


  1.  

    Possible revision for Journal Articles

    1. Spurred on by an aside that spatialed made in some discussion post a while back (I can't locate the post), I'm considering revising the way that we model journal issues.  Currently, each issue has its own topic, which links to both the journal and the articles contained in that issue. (See
      http://www.freebase.com/type/schema/book/journal_issue)  The main problems with this format are that it is cumbersome to enter data, and also that most bibliographic sources are concerned primarily with the article and the journal, relegating the issue to a series of strings (volume, issue, date). This latter issue might make integration with standard bibliographic schemas a bit cumbersome, although it wouldn't be insurmountable.

      As an experiment, though, I thought I'd try to see what a model that eliminated the issue type entirely looked like.  Here are the results:
      http://sandbox.freebase.com/view/guid/9202a8c04000641f8000000008cd1edd.

      I've replaced the issue type with a CVT that connects the article and the journal, and includes the standard bibliographic data of Volume, issue, date, issue date extra, and pages ("issue date extra" is something I had to make up for journals that aren't published on a schedule that translates into mm/dd/yyyy).  Journal articles have both Scholarly Work and Written Work as included types, although a journal article can also be a review, editorial, letter or other type of writing.

      The only real disadvantage that I see to this is that constructing the contents of a given issue will be harder -- users will have to query on a combination of several fields (volume, issue, etc.) to find what they're looking for.

      I'd love to hear what people think about this. If it seems to work, I might do the same thing for newspaper issues.

      This is also being discussed on the data-modelers mailing list: http://lists.freebase.com/pipermail/data-modeling/2008-July/000989.html
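To make the trade-off described in this post concrete, here is a minimal sketch in plain Python (not Freebase or MQL code). The class name `JournalPublication` follows the sandbox type mentioned later in the thread; the field names, the helper `issue_contents`, and all sample records are hypothetical placeholders. It shows the CVT as one record per article, and the multi-field match that would be needed to rebuild an issue's contents once the issue topic is gone.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class JournalPublication:
    """Hypothetical stand-in for the proposed CVT linking an article to a journal."""
    article: str
    journal: str
    volume: Optional[str] = None
    issue: Optional[str] = None
    date: Optional[str] = None              # e.g. "2005-06"
    issue_date_extra: Optional[str] = None  # e.g. "Summer", for off-calendar issues
    pages: Optional[str] = None


def issue_contents(records: List[JournalPublication],
                   journal: str, volume: str, issue: str) -> List[str]:
    """Rebuild one issue's contents by matching on several fields at once."""
    return [r.article for r in records
            if r.journal == journal and r.volume == volume and r.issue == issue]


# Placeholder records, invented purely for illustration.
RECORDS = [
    JournalPublication("Article A", "Example Journal", "19", "3", pages="1-10"),
    JournalPublication("Article B", "Example Journal", "19", "3", pages="11-20"),
    JournalPublication("Article C", "Example Journal", "19", "4"),
]

print(issue_contents(RECORDS, "Example Journal", "19", "3"))
```

Note that nothing in this structure can say "this is the X-themed issue"; the issue exists only as a combination of values repeated on each record.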

      1. I think it might be going too far to eliminate the issue type, although I agree that for most purposes the article is what matters. Some journal issues deal with a set theme or with responses to a target article. For example, The American Journal of Bioethics always sets one or more target articles that other contributors respond to. So it might be useful to keep journal issue as its own type to enable tracking this kind of issue-level information. I don't know the system well enough yet to know whether this information can be stored in a CVT or if it needs a standard type. It might make more sense to eliminate the issue type for newspapers, since generally there is no theme to the content of a single newspaper issue.

      2. I see your point about the theme and target-article issues, and why it would be useful to be able to view the contents at the issue level in those cases. Although if you know the issue/volume/date information, you could use the filter view of the CVT to recreate a table of contents (there would be nothing anywhere that said, however, that "This is the X-themed issue of Y journal"). I am more than a bit concerned about using both schemata simultaneously -- I worry both that the two schemata would fall out of sync very quickly, making it harder for users to use the bibliographic data, and also that someone might figure out a way to keep them in sync (via some automated process), which would eat up a lot of extra space.

        I'd be curious to hear more of your thoughts on this.

        I'll repost the schema to sandbox either later today or tomorrow (the data on sandbox is due to be refreshed in an hour or so).

      3. The proposed schema is on sandbox again, this time here:

        https://sandbox.freebase.com/type/view/book/journal_publication

      4. I like the way the sandbox version looks, but I'm not sure I understand the concern about the schemata coming apart if you have both an issue type and an article type. You wouldn't store the issue information separately in the article type; you'd just link to the issue, and the issue in turn links to the Journal.

         i.e.,

        Object of Article type has a property "Published in" linking it to an object of Journal Issue type

        Object of Journal Issue type has a property "Journal" linking it to a journal.

        What I'm not sure about, because I'm new to freebase, is whether there is a good way to display the information if links are chained in this way. If you set up the types in the way I've suggested, can you look at a journal article and see both the issue and the journal that that issue links to?

        Maybe there's something I'm not getting about your concern about schema getting out of sync. Please let me know as I'm anxious to learn more about how things work around here.
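The chained arrangement described in this post (article to issue, issue to journal) can be pictured with a small sketch. This is plain Python, not a Freebase API; the topic ids, property keys, and the helper `follow` are hypothetical names chosen for illustration. It only shows that the data supports walking the chain, which is a separate question from whether the site's UI can display it.

```python
from typing import Dict, List

# Hypothetical topics keyed by id; each maps property names to values or other ids.
TOPICS: Dict[str, Dict[str, str]] = {
    "article-1": {"type": "Article", "published_in": "issue-1"},
    "issue-1": {"type": "Journal Issue", "journal": "journal-1",
                "volume": "19", "issue": "3"},
    "journal-1": {"type": "Journal", "name": "Example Journal"},
}


def follow(topics: Dict[str, Dict[str, str]], start: str, path: List[str]) -> str:
    """Walk a chain of properties, treating each intermediate value as a topic id."""
    current = start
    for prop in path:
        current = topics[current][prop]
    return current


# Article -> "published_in" -> issue -> "journal" -> journal -> "name"
print(follow(TOPICS, "article-1", ["published_in", "journal", "name"]))
```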

      5. The issue isn't about having both an article type and an issue type -- we already have those (see the help topic Entering Scholarly Works and Citations for an explanation of the current model). The issue is how to best represent the publication information of an article -- whether by omitting the issue type entirely, and just using a CVT to capture the basic bibliographic information, like you would see in a bibliography (or bibtex or similar format), or whether to have an explicit topic for every journal issue.

        There is not, alas, a way to display the information in chained links. You can display disambiguating properties from the expected type of a property only. So in the sandbox version, the volume, issue, etc. are properties on the Journal Publication type, and so can be displayed on the article and journal types that connect to it.  But the way we otherwise handle publications is by using the "published work" and "publication" types, which have to be set as co-types on the article and issue.  Because the expected type is not explicitly "journal issue", properties on "journal issue" are not displayed on the article or journal.

        Let me know if this doesn't make sense.

      6. Okay, let me see if I understand. Currently, you can enter the journal article in the contents of a journal issue where it links in as a publication. And in published work type, there is the "published in" property which gets filled in with the journal issue. So there is no specific Journal Article type because there doesn't need to be one separate from publication. Sorry about the confusion.

        So now I'm having trouble figuring out what to do if you want to search for only journal articles. I start by searching for published work, but when I try to filter it by the "published in" property, it wants me to put in the name of a specific publication; it doesn't accept the type Journal Issue. I suppose that the sandbox version would solve this problem, but at the expense of not being able to record issue-level information. It might still be more user-friendly to have a journal article type under the current system even if the journal article type has no extra properties above and beyond published work. It would simply allow for distinguishing journal articles from other publications at a glance.

      7. At that point, Journal Article would be nothing more than what we sometimes call a "bucket" -- a type that has no properties, and is therefore just a place to collect things that can be said to be of that type. There are a number of these here and there, but we try to avoid them as much as possible.  In order for it to be useful, users would have to remember to add it themselves to each topic that is published in a journal; we're already asking users to manually add a lot of types in the publishing domain, just to use the basic functionality, and I have doubts about how much this type would ever be used.

        I can't think of a way, through the UI, to filter out only those works that were published in Journal Issues. How often do you think people would be looking for journal issues but not other types of academic or scholarly papers?

      8. You could make a Journal Article type that includes the published work type. Then make the contents of the Journal Issue type link to Journal Article. That way any new objects that get linked to from a Journal Issue will automatically be created as Journal Article types. So users won't have to manually add the Journal Article type. The problem is that all existing articles that are presently of the published work type would have to be migrated over to being of the journal article type.

      9. If Journal Article were linked directly to Journal Issue, it would remove the need for co-typing it with Published Work.  But otherwise, what you suggest would work. But if we do keep the existing Journal Issue model (as opposed to the CVT version under review), I'd be reluctant to undermine the existing Published Work/Publication model by creating a special type that used a different method of publication. (I realize that I'm already proposing this with the CVT model, and I do recognize the inconsistency; I guess my argument is that if the needs of modeling Journal Articles are significantly different from other types of publications the different model is justified, but if it's just a question of copying the current model onto different types, I'm not sure that it is justified.)
      10. I typically lean towards the "let's describe every detail" side of data modeling, but the journal article CVT has some real appeal to me. It is simple and quick. Add a couple of numbers and you are done, which is the case in most bibliographic software, for good reason. Otherwise, users are forced to add a redundant journal name when describing an issue. For example, see Conservation Biology volume 19 issue 3. That just seems silly to me and ripe for inconsistent data entry that will be difficult to summarize.

        However, there are also cases, as avic has mentioned, when an issue focuses on a special topic and may be presented as a collection. In those cases, it would be very useful to be able to search through a list of special topic issues. None of the suggested models seems to permit this kind of summary.

        So... how about another option that may include a bit of denormalization? How about we keep both issue models? The CVT can be used as the default, but if the user wants to add more detail for the issue, there is a separate "issue details" property (or something similar) that links to a "Detailed journal issue" type with properties that allow users to describe special subjects, editors, cover art, etc., and reciprocated links with journal and articles. Alternatively, the CVT could hold the data and just not include them as disambiguators.

        Another option would be to somehow automate the naming (and possible renaming) of new issue types when data are added to the journal article CVT. For example, the above-mentioned Conservation Biology volume 19 issue 3 topic could be created or updated through scripting after a user enters or changes journal, volume, and issue info on an article page. The new issue topic could then be added to the "detailed issue" property of the article topic. Just throwing some ideas out there - fire away.
      11. To dispense with two of Ed's suggestions quickly (sorry Ed!): Fancy auto-naming isn't possible at the moment, and it's extremely difficult to get to CVTs through the UI, so non-disambiguating properties on CVTs are not especially useful.

        Denormalization is a possibility, though. We could add the CVT for bibliographic info, and keep the Journal Issue type pretty much as it is, but maybe with some tweaks -- add a subject property, maybe change the name of the property on the Journal type to "special subject issues" or something. Journal articles could still be included via the publication/contents property.

      12. It was worth a shot. :)

        I vote for denormalization (i.e., keep the CVT and add a separate property for issue topics). I don't think it is necessary to limit the journal issue topics to those that are special. There should be a property of the journal issue to indicate if it is a special issue and what the special issue subject is, but someone may want to describe all issues of a journal one day. 

      13. The more I think about it, the less I'm comfortable with the denormalization. I think we should model for the large majority of cases, and just use the CVT for now.  (Stuff can always be added later, if there turns out to be a demand after we start getting journal articles in any kind of quantity.)


  2.  

    Genre parents/children

    1. Should literary genre include media genre, or perhaps delegate some properties from it?  I expected to find parent/child genres on literary genre, but there weren't any.
      1. It does include media genre -- where did you see the problem?
      2. Ah, I was looking at the type page for Literary Genre and saw that most of them weren't co-typed, and just assumed.  I performed the co-typing for those that were there, then posted.  But I guess the inclusion exists but wasn't back-ported onto existing instances.  My error.


  3.  

    Book edition properties

    1. I wonder whether there should be an optional editor property for book editions. I just added the book The Right and the Good, by W. D. Ross. It was originally published in 1930 with no editor. It was re-published in 2002. The 2002 edition was edited by Philip Stratton-Lake. Given the current schema I had to put Philip Stratton-Lake as editor of the book, but this is misleading because it makes it appear that Stratton-Lake was the editor of the 1930 original.
      1. This is a good idea; I think we had a model that captured this at one point, but it got lost in a refactoring.  The only issue I have before just adding the property is what the reciprocal property should be on the Author type. There are already properties for "Works Edited" and "Book editions published", either one of which could be confused with the new property. Any thoughts?  One thing we could do, I suppose, is create a new type (Book edition editor or something), but I'd like to exhaust other possibilities first.
      2. I'm not sure why we need the "book editions published" property as a separate property. Why not split this up into two properties, "book editions authored" and "book editions edited"?
      3. Interesting point; if we did so, we would actually be storing two different kinds of assertions in the "book editions edited" property. One (the one used by "book editions authored") for the various editions of a book that the person has edited (e.g., Gardner Dozois edited "The Year's Best Science Fiction, 23rd Annual Collection", regardless of edition), and one for editions in which the editor has edited only that edition (e.g., the 2002 "The Right and the Good"), which might be confusing. (E.g., if you didn't know anything about the book, and saw that Stratton-Lake edited an edition of The Right and the Good, how could you be certain that he edited only that edition, and that his absence from the Book topic of The Right and the Good wasn't just an oversight?) Or am I just overthinking this?
      4. I think there should be an "edited by" property in the Book type which is for editors of a book regardless of the specific edition (such as Gardner Dozois in your example) and a different "edited by" property in the Book Edition type for editors who edited a specific edition of the work, but who are not editors of the work itself (for example, Stratton-Lake).

         

        If it is too confusing to have two different properties by the same name, you could call the book-level property "work edited by" and the edition-level property "edition edited by". You could have a note in the "edition edited by" property description telling people that they need not duplicate entries that already apply to works as a whole.

         

        Likely the "work edited by" property would be best used for works like compilations, anthologies, encyclopedias, and topic-focused academic books which have different authors for each chapter. In contrast, the "edition edited by" property would be used for works that were not initially edited works, but that have been re-published in a new edition with added scholarly apparatus and/or an editorial introduction.
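The split between a work-level and an edition-level editor property might look like the following sketch. It is plain Python rather than an actual Freebase schema; the class names, the field names `work_edited_by` and `edition_edited_by` (taken from the property names suggested above), and the helper `editors_shown_for` are assumptions made for illustration. The two books are the examples already used in this thread.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Book:
    title: str
    # Editors of the work as a whole, regardless of edition.
    work_edited_by: List[str] = field(default_factory=list)


@dataclass
class BookEdition:
    book: Book
    year: Optional[int] = None
    # Editors of this edition only; work-level editors need not be repeated here.
    edition_edited_by: List[str] = field(default_factory=list)


def editors_shown_for(edition: BookEdition) -> List[str]:
    """Work-level editors first, then edition-only editors, without duplicates."""
    seen: List[str] = []
    for name in edition.book.work_edited_by + edition.edition_edited_by:
        if name not in seen:
            seen.append(name)
    return seen


right_and_good = Book("The Right and the Good")
original = BookEdition(right_and_good, 1930)
reissue = BookEdition(right_and_good, 2002, ["Philip Stratton-Lake"])

anthology = Book("The Year's Best Science Fiction, 23rd Annual Collection",
                 ["Gardner Dozois"])
any_edition = BookEdition(anthology)

print(editors_shown_for(original))     # no editor attached to the 1930 original
print(editors_shown_for(reissue))
print(editors_shown_for(any_edition))
```

With this arrangement the 2002 editor no longer appears on the 1930 original, while an anthology's editor is still visible on every edition.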



  4.  

    Field of study

    1. Academic books should have a property linking them to a field of study, e.g., philosophy, psychology, etc.
      1. All books have a subject property, which seems like it should be sufficient. (The property is actually on the type "written work", but all books should have that type as a co-type.)
      2. Thank you, you're right, that does seem to cover it. I'm still getting used to the way that co-typing works (i.e., that something can be both a field of study type and a book subject type).


  5.  

    Fair use of academic journal indexes

    1. It would be great to have some serious bibliographic data on academic journal articles. Would it be within copyright law to download bibliographic data from a proprietary index, use a custom script to extract and repackage the data into a Freebase-friendly form, and then bulk upload the lot?

      It seems to me that as long as the information uploaded from the index is freely available to the public, this should not be a violation of copyright. Obviously a distinction would have to be made between those indexes which simply provide publicly available bibliographic data and those which actually generate content that could be considered proprietary (e.g., an index which provides an abstracting service).

      Would any copyright experts care to weigh in on this?



  6.  

    Journal article type

    1. There doesn't appear to be a type for journal article. There is journal issue, but this doesn't quite cover it. It would be very useful for academic research to have a journal article type which would be for a specific journal article. 



  7.  

    Newspaper: Accommodate Weeklies

    1. Can we remove the daily-specific nature of the Newspaper type so we can use it for weeklies?
      1. Good point.  The issue then is how to model things like price and circulation in a way that accommodates weekly papers as well as those that have both daily and Sunday editions.  We could do a CVT with value, date, and frequency (or maybe call it edition?). Papers with only one edition could optionally leave frequency blank, I suppose.  Any other thoughts?
      2. That would also help modeling papers with morning and evening editions (the Providence Journal-Bulletin had this until very recently).
      3. I was thinking something like that, where we combined the fields into one. 

        And I had no idea there were papers that still did morning/evening editions!

      4. It could also address local vs. national editions (like the NY Times), which might be interesting to some people. I'll add a task for myself in the old refactoring queue.
      5. Take a look here: https://www.freebase.com/view/user/jeff/newspaper and see what you think.  I've filled out the San Francisco Chronicle with data in both old and new formats for comparison.
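The CVT floated in this thread (value, date, and an optional frequency or edition) could be sketched roughly as below. This is plain Python, not the schema at the link above; the class name `CirculationRecord`, the helper `latest_circulation`, the paper names, and every figure are invented placeholders. Dates are assumed to be ISO strings so that they sort correctly as text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CirculationRecord:
    """Hypothetical CVT: one circulation figure for one edition at one date."""
    newspaper: str
    value: int
    date: str                      # ISO date, e.g. "2008-03-31"
    edition: Optional[str] = None  # "daily", "Sunday", "morning", ...; None if only one edition


def latest_circulation(records: List[CirculationRecord],
                       newspaper: str,
                       edition: Optional[str] = None) -> Optional[int]:
    """Return the most recent figure for a paper and edition, or None if absent."""
    matches = [r for r in records
               if r.newspaper == newspaper and r.edition == edition]
    if not matches:
        return None
    return max(matches, key=lambda r: r.date).value


# Placeholder data; the papers and numbers are made up.
RECORDS = [
    CirculationRecord("Example Weekly", 12000, "2007-09-30"),
    CirculationRecord("Example Weekly", 11500, "2008-03-31"),
    CirculationRecord("Example Daily", 300000, "2008-03-31", "daily"),
    CirculationRecord("Example Daily", 350000, "2008-03-31", "Sunday"),
]

print(latest_circulation(RECORDS, "Example Weekly"))
print(latest_circulation(RECORDS, "Example Daily", "Sunday"))
```

The same record shape would cover morning/evening or local/national editions by using a different edition label; price could be handled with a parallel record type.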


  8.  

    Pubmed Article

    1. An article in PubMed (http://www.ncbi.nlm.nih.gov/pubmed/) can be defined by a journal (ISSN, abbreviation, title), an issue (volume, issue, date), a title, a PubMed identifier (PMID), a DOI, and a list of authors. I'm lost in all those types. Dear Lazyweb, what are the types that should be used to create a new instance of an article?

      Thanks

      Pierre

      1. For a start, see the help topic Entering Scholarly Works and Citations.  The short answer is that an article should have the types Scholarly Work, Written Work, and Published Work. Between these, you should be able to enter most of the properties you mention.
      2. Thanks !